Streptococcus pneumoniae Polynucleotides and Sequences

FIELD OF THE INVENTION

The present invention relates to the field of molecular biology. In
particular, it relates to, among other things, nucleotide sequences of
Streptococcus pneumoniae, contigs, ORFs, fragments, probes, primers and related
polynucleotides thereof, peptides and polypeptides encoded by the sequences, and
uses of the polynucleotides and sequences thereof, such as in fermentation,
polypeptide production, assays and pharmaceutical development, among others.

BACKGROUND OF THE INVENTION 

Streptococcus pneumoniae has been one of the most extensively studied
microorganisms since its first isolation in 1881. It was the object of many
investigations that led to important scientific discoveries. In 1928, Griffith
observed that when heat-killed encapsulated pneumococci and live strains
constitutively lacking any capsule were concomitantly injected into mice, the
nonencapsulated could be converted into encapsulated pneumococci with the same
capsular type as the heat-killed strain. Years later, the nature of this
"transforming principle," or carrier of genetic information, was shown to be
DNA. (Avery, O.T., et al., J. Exp. Med., 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many
questions about its virulence are still unanswered, and this pathogen remains a
major causative agent of serious human disease, especially community-acquired
pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)). In addition, in developing countries, the pneumococcus is responsible
for the death of a large number of children under the age of 5 years from
pneumococcal pneumonia. The incidence of pneumococcal disease is highest in
infants under 2 years of age and in people over 60 years of age. Pneumococci are
the second most frequent cause (after Haemophilus influenzae type b) of
bacterial meningitis and otitis media in children. With the recent introduction
of conjugate vaccines for H. influenzae type b, pneumococcal meningitis is
likely to become increasingly prominent. S. pneumoniae is the most important
etiologic agent of community-acquired pneumonia in adults and is the second most
common cause of bacterial meningitis behind Neisseria meningitidis.

The antibiotic generally prescribed to treat S. pneumoniae is
benzylpenicillin, although resistance to this and to other antibiotics is found
occasionally. Pneumococcal resistance to penicillin results from mutations in
its penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused
by a sensitive strain, treatment with penicillin is usually successful unless
started too late. Erythromycin or clindamycin can be used to treat pneumonia in
patients hypersensitive to penicillin, but strains resistant to these drugs
exist. Broad spectrum antibiotics (e.g., the tetracyclines) may also be
effective, although tetracycline-resistant strains are not rare. In spite of the
availability of antibiotics, the mortality of pneumococcal bacteremia in the
last four decades has remained stable between 25 and 29%. (Gillespie, S.H., et
al., J. Med. Microbiol. 28:231-248 (1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy
individuals. It has been suggested that attachment of pneumococci is mediated by
a disaccharide receptor on fibronectin, present on human pharyngeal epithelial
cells. (Anderson et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by
which pneumococci translocate from the nasopharynx to the lung, thereby causing
pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are
poorly understood. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl.
6):S509-517 (1991)).

Various proteins have been suggested to be involved in the pathogenicity of
S. pneumoniae; however, only a few of them have actually been confirmed as
virulence factors. Pneumococci produce an IgA1 protease that might interfere
with host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev. Inf. Dis.
3:521-534 (1981)). S. pneumoniae also produces neuraminidase, an enzyme that may
facilitate attachment to epithelial cells by cleaving sialic acid from the host
glycolipids and gangliosides. Partially purified neuraminidase was observed to
induce meningitis-like symptoms in mice; however, the reliability of this
finding has been questioned because the neuraminidase preparations used were
probably contaminated with cell wall products. Other pneumococcal proteins
besides neuraminidase are involved in the adhesion of pneumococci to epithelial
and endothelial cells. These pneumococcal proteins have as yet not been
identified.

Recently, Cundell et al. reported that peptide permeases can modulate
pneumococcal adherence to epithelial and endothelial cells. It was, however,
unclear whether these permeases function directly as adhesins or whether they
enhance adherence by modulating the expression of pneumococcal adhesins.
(DeVelasco, E.A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding
of the virulence factors determining its pathogenicity will need to be developed
to cope with the devastating effects of pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery of
DNA, little is known about the molecular genetics of the organism. The S.
pneumoniae genome consists of one circular, covalently closed, double-stranded
DNA and a collection of so-called variable accessory elements, such as
prophages, plasmids, transposons and the like. Most physical characteristics and
almost all of the genes of S. pneumoniae are unknown. Among the few that have
been identified, most have not been physically mapped or characterized in
detail. Only a few genes of this organism have been sequenced. (See, for
instance, current versions of GENBANK and other nucleic acid databases, and
references that relate to the genome of S. pneumoniae such as those set out
elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S.
pneumoniae infection involves the programmed expression of S. pneumoniae
genes, and that characterizing the genes and their patterns of expression would
add dramatically to our understanding of the organism and its host interactions.
Knowledge of S. pneumoniae genes and genomic organization would improve our
understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover,
characterized genes and genomic fragments of S. pneumoniae would provide
reagents for, among other things, detecting, characterizing and controlling S.
pneumoniae infections. There is a need to characterize the genome of S.
pneumoniae and for polynucleotides of this organism.

SUMMARY OF THE INVENTION

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome. The primary nucleotide sequences which were
generated are provided in SEQ ID NOS: 1-391.

The present invention provides the nucleotide sequence of several hundred
contigs of the Streptococcus pneumoniae genome, which are listed in tables below
and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and
interpreted by a skilled artisan. In one embodiment, the present invention is
provided as contiguous strings of primary sequence information corresponding to
the nucleotide sequences depicted in SEQ ID NOS: 1-391.

The present invention further provides nucleotide sequences which are at
least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-391.

The nucleotide sequence of SEQ ID NOS: 1-391, a representative fragment
thereof, or a nucleotide sequence which is at least 95% identical to the
nucleotide sequence of SEQ ID NOS: 1-391 may be provided in a variety of mediums
to facilitate its use. In one application of this embodiment, the sequences of
the present invention are recorded on computer readable media. Such media
include, but are not limited to: magnetic storage media, such as floppy discs,
hard disc storage medium, and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information herein described stored in a
data storage means. Such systems are designed to identify commercially important
fragments of the Streptococcus pneumoniae genome.

Another embodiment of the present invention is directed to fragments of the
Streptococcus pneumoniae genome having particular structural or functional
attributes. Such fragments of the Streptococcus pneumoniae genome of the present
invention include, but are not limited to, fragments which encode peptides,
hereinafter referred to as open reading frames or ORFs, fragments which modulate
the expression of an operably linked ORF, hereinafter referred to as expression
modulating fragments or EMFs, and fragments which can be used to diagnose the
presence of Streptococcus pneumoniae in a sample, hereinafter referred to as
diagnostic fragments or DFs.

Each of the ORFs in fragments of the Streptococcus pneumoniae genome
disclosed in Tables 1-3, and the EMFs found 5' to the ORFs, can be used in
numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining
the presence of a specific microbe in a sample, to selectively control gene
expression in a host, and in the production of polypeptides, such as
polypeptides encoded by ORFs of the present invention, particularly those
polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genome of the present
invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Streptococcus
pneumoniae genome has been inserted.

The present invention further provides host cells containing any of the
isolated fragments of the Streptococcus pneumoniae genome of the present
invention. The host cells can be a higher eukaryotic host cell, such as a
mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic
cell such as a bacterial cell.

The present invention is further directed to isolated polypeptides and
proteins encoded by ORFs of the present invention. A variety of methods, well
known to those of skill in the art, routinely may be utilized to obtain any of
the polypeptides and proteins of the present invention. For instance,
polypeptides and proteins of the present invention having relatively short,
simple amino acid sequences readily can be synthesized using commercially
available automated peptide synthesizers. Polypeptides and proteins of the
present invention also may be purified from bacterial cells which naturally
produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express
them.

The invention further provides methods of obtaining homologs of the
fragments of the Streptococcus pneumoniae genome of the present invention and
homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein
as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind
polypeptides and proteins of the present invention. Such antibodies include both
monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described
antibodies. A hybridoma is an immortalized cell line which is capable of
secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples
derived from cells which express one of the ORFs of the present invention, or a
homolog thereof. Such methods comprise incubating a test sample with one or
more of the antibodies of the present invention, or one or more of the DFs of
the present invention, under conditions which allow a skilled artisan to
determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers which comprise: (a) a first container
comprising one of the antibodies, or one of the DFs of the present invention;
and (b) one or more other containers comprising one or more of the following:
wash reagents, reagents capable of detecting the presence of bound antibodies or
hybridized DFs.

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents capable of binding
to a polypeptide or protein encoded by one of the ORFs of the present invention.
Specifically, such agents include, as further described below, antibodies,
peptides, carbohydrates, pharmaceutical agents and the like. Such methods
comprise steps of: (a) contacting an agent with an isolated protein encoded by
one of the ORFs of the present invention; and (b) determining whether the agent
binds to said protein.

The present genomic sequences of Streptococcus pneumoniae will be of
great value to all laboratories working with this organism and for a variety of
commercial purposes. Many fragments of the Streptococcus pneumoniae genome
will be immediately identified by similarity searches against GenBank or protein
databases and will be of immediate value to Streptococcus pneumoniae researchers
and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic
sequences of bacterial and other genomes has enhanced, and will continue to
greatly enhance, the ability to analyze and understand chromosomal organization.
In particular, sequenced contigs and genomes will provide the models for
developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA,
the structure, position, and spacing of regulatory elements, the identification
of genes with potential industrial applications, and the ability to do
comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES 

FIGURE 1 is a block diagram of a computer system (102) that can be
used to implement computer-based systems of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer
programs used to collect, assemble, edit and annotate the contigs of the
Streptococcus pneumoniae genome of the present invention. Both Macintosh and
Unix platforms are used to handle the AB 373 and 377 sequence data files,
largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual
Hawaii International Conference on System Sciences, 585, IEEE Computer Society
Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for
automatic vector sequence removal and end-trimming of sequence files. The
program Loadis runs on a Macintosh platform and parses the feature data
extracted from the sequence files by Factura to the Unix based Streptococcus
pneumoniae relational database. Assembly of contigs (and whole genome sequences)
is accomplished by retrieving a specific set of sequence files and their
associated features using Extrseq, a Unix utility for retrieving sequences from
an SQL database. The resulting sequence file is processed by seq_filter to trim
portions of the sequences with more than 2% ambiguous nucleotides. The sequence
files were assembled using TIGR Assembler, an assembly engine designed at The
Institute for Genomic Research (TIGR) for rapid and accurate assembly of
thousands of sequence fragments. The collection of contigs generated by the
assembly step is loaded into the database with the lassie program.
Identification of open reading frames (ORFs) is accomplished by processing
contigs with zorf or GenMark. The ORFs are searched against S. pneumoniae
sequences from GenBank and against all protein sequences using the BLASTN and
BLASTP programs, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990).
Results of the ORF determination and similarity searching steps were loaded into
the database. As described below, some results of the determination and the
searches are set out in Tables 1-3.
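
The ORF identification step above relies on zorf or GenMark, whose internals are
not described in this document. Purely as an illustration of what identifying an
open reading frame involves, the following Python sketch scans the three forward
reading frames of a contig for stretches that begin with ATG and end at the
first in-frame stop codon. The function name find_orfs, the min_codons
threshold and the example contig are assumptions made for this sketch; the
reverse strand and alternative start codons are deliberately ignored.

```python
# Illustrative sketch only: a naive forward-strand ORF scan.
# This is NOT zorf or GenMark; names and thresholds are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(contig, min_codons=3):
    """Return (start, end, frame) tuples for ATG..stop ORFs.

    start and end are 0-based offsets into the contig; end is exclusive
    and includes the stop codon. min_codons counts codons before the stop.
    """
    seq = contig.upper()
    orfs = []
    for frame in range(3):
        start = None  # offset of the currently open ATG, if any
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    # Hypothetical toy contig, not taken from the Sequence Listing.
    print(find_orfs("CCATGAAACCCGGGTAGTT"))
```

A production gene finder would also score codon usage and search both strands;
this sketch only shows the frame-by-frame bookkeeping.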

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome and analysis of the sequences. The primary
nucleotide sequences generated by sequencing the fragments are provided in SEQ
ID NOS: 1-391. (As used herein, the "primary sequence" refers to the nucleotide
sequence represented by the IUPAC nomenclature system.)

In addition to the aforementioned Streptococcus pneumoniae polynucleotides
and polynucleotide sequences, the present invention provides the nucleotide
sequences of SEQ ID NOS: 1-391, or representative fragments thereof, in a form
which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence
depicted in SEQ ID NOS: 1-391" refers to any portion of the SEQ ID NOS: 1-391
which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Streptococcus
pneumoniae open reading frames (ORFs), expression modulating fragments (EMFs)
and fragments which can be used to diagnose the presence of Streptococcus
pneumoniae in a sample (DFs). A non-limiting identification of preferred
representative fragments is provided in Tables 1-3. As discussed in detail
below, the information provided in SEQ ID NOS: 1-391 and in Tables 1-3 together
with routine cloning, synthesis, sequencing and assay methods will enable those
skilled in the art to clone and sequence all "representative fragments" of
interest, including open reading frames encoding a large variety of
Streptococcus pneumoniae proteins.

While the presently disclosed sequences of SEQ ID NOS: 1-391 are highly
accurate, sequencing techniques are not perfect and, in relatively rare
instances, further investigation of a fragment or sequence of the invention may
reveal a nucleotide sequence error present in a nucleotide sequence disclosed in
SEQ ID NOS: 1-391. However, once the present invention is made available (i.e.,
once the information in SEQ ID NOS: 1-391 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS: 1-391 will be well
within the skill of the art. The present disclosure makes available sufficient
sequence information to allow any of the described contigs or portions thereof
to be obtained readily by straightforward application of routine techniques.
Further sequencing of such polynucleotides may proceed in like manner using
manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example,
Applied Biosystems' (AB) AutoAssembler can be used as an aid during visual
inspection of nucleotide sequences. By employing such routine techniques
potential errors readily may be identified and the correct sequence then may be
ascertained by targeting further sequencing effort, also of a routine nature, to
the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS: 1-391 were 
corrected, the resulting nucleotide sequences would still be at least 95% identical, 
nearly all would be at least 99% identical, and the great majority would be at least 
99.9% identical to the nucleotide sequences of SEQ ID NOS: 1-391. 

As discussed elsewhere herein, polynucleotides of the present invention
readily may be obtained by routine application of well known and standard
procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of
Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae
genomic DNA for cloning and for obtaining polynucleotides of the present
invention are available to the public from recognized depository institutions,
such as the American Type Culture Collection (ATCC). While the present invention
is enabled by the sequences and other information herein disclosed, the S.
pneumoniae strain that provided the DNA of the present Sequence Listing, Strain
7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill
in the art. As a further convenience, a library of S. pneumoniae genomic DNA,
derived from the same strain, also has been deposited in the ATCC. The S.
pneumoniae strain was deposited on October 10, 1996, and was given Deposit No.
55840, and the genomic DNA library was deposited on October 11, 1996 and was
given Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb
fragments generated by partial Sau3AI digestion and they are inserted into the
BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene,
La Jolla, CA). The provision of the deposits is not a waiver of any rights of
the inventors or their assignees in the present subject matter.
The nucleotide sequences of the genomes from different strains of
Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences
of the genomes of all Streptococcus pneumoniae strains will be at least 95%
identical, in corresponding part, to the nucleotide sequences provided in SEQ ID
NOS: 1-391. Nearly all will be at least 99% identical and the great majority
will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which
are at least 95%, preferably 99% and most preferably 99.9% identical to the
nucleotide sequences of SEQ ID NOS: 1-391, in a form which can be readily used,
analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at
least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID
NOS: 1-391 are routine and readily available to the skilled artisan. For
example, the well known fasta algorithm described in Pearson and Lipman, Proc.
Natl. Acad. Sci. USA 85: 2444 (1988) can be used to generate the percent
identity of nucleotide sequences. The BLASTN program also can be used to
generate an identity score of polynucleotides compared to one another.
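
To make the 95%, 99% and 99.9% thresholds concrete, the following Python sketch
computes percent identity for two sequences that have already been aligned to
equal length, with "-" marking gaps. It is not an implementation of the fasta
or BLASTN algorithms, which also perform the alignment itself; the function name
percent_identity and the convention of counting a gap against a base as a
mismatch are assumptions of this sketch.

```python
# Illustrative sketch only: percent identity over a precomputed alignment.
# The alignment step performed by fasta or BLASTN is assumed, not shown.


def percent_identity(aligned_a, aligned_b):
    """Return the percentage of compared columns holding identical residues.

    Both inputs must be the same length. Columns that are gaps in both
    rows are skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # no information in a double-gap column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical 20-base example with a single mismatch.
    reference = "ACGTACGTACGTACGTACGT"
    candidate = "ACGTACGTACGTACGTACGA"
    identity = percent_identity(reference, candidate)
    print(identity, identity >= 95.0)
```

Different tools define the denominator differently (alignment length, shorter
sequence length, and so on), so reported identity values are only comparable
when the same convention is used.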

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS: 1-391, a representative
fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99%
and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ
ID NOS: 1-391 may be "provided" in a variety of mediums to facilitate use
thereof. As used herein, "provided" refers to a manufacture, other than an
isolated nucleic acid molecule, which contains a nucleotide sequence of the
present invention; i.e., a nucleotide sequence provided in SEQ ID NOS: 1-391, a
representative fragment thereof, or a nucleotide sequence at least 95%,
preferably at least 99% and most preferably at least 99.9% identical to a
polynucleotide of SEQ ID NOS: 1-391. Such a manufacture provides a large portion
of the Streptococcus pneumoniae genome and parts thereof (e.g., a Streptococcus
pneumoniae open reading frame (ORF)) in a form which allows a skilled artisan to
examine the manufacture using means not directly applicable to examining the
Streptococcus pneumoniae genome or a subset thereof as it exists in nature or in
purified form.

WO 98/18931    PCT/US97/19588

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by
a computer. Such media include, but are not limited to: magnetic storage media,
such as floppy discs, hard disc storage medium, and magnetic tape; optical
storage media such as CD-ROM; electrical storage media such as RAM and ROM; and
hybrids of these categories, such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable
mediums can be used to create a manufacture comprising computer readable
medium having recorded thereon a nucleotide sequence of the present invention.
Likewise, it will be clear to those of skill how additional computer readable
media that may be developed also can be used to create analogous manufactures
having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to
generate manufactures comprising the nucleotide sequence information of the
present invention. A variety of data storage structures are available to a
skilled artisan for creating a computer readable medium having recorded thereon
a nucleotide sequence of the present invention. The choice of the data storage
structure will generally be based on the means chosen to access the stored
information. In addition, a variety of data processor programs and formats can
be used to store the nucleotide sequence information of the present invention on
computer readable medium. The sequence information can be represented in a word
processing text file, formatted in commercially available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file,
stored in a database application, such as DB2, Sybase, Oracle, or the like. A
skilled artisan can readily adapt any number of data-processor structuring
formats (e.g., text file or database) in order to obtain computer readable
medium having recorded thereon the nucleotide sequence information of the
present invention.
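
As one hedged illustration of recording sequence information in an ASCII file,
the Python sketch below writes named sequences to a plain text file and reads
them back. It uses the widely known FASTA layout (a ">" header line followed by
sequence lines), which is an assumption of this sketch and is not specified in
this document; the function names, the file name contigs.txt and the record name
contig_example are likewise hypothetical placeholders.

```python
# Illustrative sketch only: store and reload sequences as an ASCII text file.
# The FASTA-style layout and all names here are assumptions, not the
# format used for the Sequence Listing.


def write_fasta(records, path, width=60):
    """Write (name, sequence) pairs, wrapping sequence lines at `width`."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records:
            handle.write(f">{name}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read the file back into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Hypothetical record; not a sequence from SEQ ID NOS: 1-391.
    write_fasta([("contig_example", "ACGT" * 40)], "contigs.txt")
    print(read_fasta("contigs.txt"))
```

A database application would replace the file with a table keyed on the sequence
identifier; the choice depends, as noted above, on how the stored information is
to be accessed.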

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form the nucleotide sequences of SEQ ID NOS:
1-391, a representative fragment thereof, or a nucleotide sequence at least 95%,
preferably at least 99% and most preferably at least 99.9% identical to a
sequence of SEQ ID NOS: 1-391, the present invention enables the skilled artisan
routinely to access the provided sequence information for a wide variety of
purposes.
The examples which follow demonstrate how software which implements
the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE
(Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Streptococcus
pneumoniae genome which contain homology to ORFs or proteins from both
Streptococcus pneumoniae and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Streptococcus pneumoniae genome
useful in producing commercially important proteins, such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described herein. Such
systems are designed to identify, among other things, commercially important
fragments of the Streptococcus pneumoniae genome.

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing
unit (CPU), input means, output means, and data storage means. A skilled artisan
can readily appreciate that any one of the currently available computer-based
systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention
comprise a data storage means having stored therein a nucleotide sequence of the
present invention and the necessary hardware means and software means for
supporting and implementing a search means.

As used herein, "data storage means" refers to memory which can store
nucleotide sequence information of the present invention, or a memory access
means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage
means. Search means are used to identify fragments or regions of the present
genomic sequences which match a particular target sequence or target motif. A
variety of known algorithms are disclosed publicly, and a variety of
commercially available software packages for conducting searches are available
and can be used in the computer-based systems of the present invention. Examples
of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the
available algorithms or implementing software packages for conducting homology
searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled
artisan can readily recognize that the longer a target sequence is, the less
likely a target sequence will be present as a random occurrence in the database.
The most preferred sequence length of a target sequence is from about 10 to 100
amino acids or from about 30 to 300 nucleotide residues. However, it is well
recognized that searches for commercially important fragments, such as sequence
fragments involved in gene expression and protein processing, may be of shorter
length.
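
The relationship between target length and chance matches can be illustrated
numerically. The Python sketch below estimates the expected number of exact
chance occurrences of a target in a database under a deliberately simple
assumption that is not stated in this document: residues are independent and
equally frequent. Under that assumption a target of length n has about
(N - n + 1) / k^n expected chance matches in a database of N residues over an
alphabet of k symbols. The function name and the example database size are
hypothetical.

```python
# Illustrative sketch only: expected exact chance matches under a uniform,
# independent-residue model. Real genomes are not uniform, so treat the
# numbers as order-of-magnitude guidance.


def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a database."""
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0  # target is longer than the database
    return positions / alphabet_size ** target_length


if __name__ == "__main__":
    # Hypothetical database of 2,000,000 nucleotides.
    for n in (6, 12, 30):
        print(n, expected_random_hits(n, 2_000_000))
    # Amino acid targets use a 20-letter alphabet.
    print(expected_random_hits(10, 2_000_000, alphabet_size=20))
```

Under this model a six-nucleotide target is expected to occur by chance hundreds
of times in a database of that size, whereas a 30-nucleotide target is
effectively never expected to occur by chance, which is consistent with the
preferred lengths given above.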

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the
sequence(s) are chosen based on a three-dimensional configuration which is
formed upon the folding of the target motif. There are a variety of target
motifs known in the art. Protein target motifs include, but are not limited to,
enzymic active sites and signal sequences. Nucleic acid target motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible
expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used
to input and output the information in the computer-based systems of the present
invention. A preferred format for an output means ranks fragments of the
Streptococcus pneumoniae genomic sequences possessing varying degrees of
homology to the target sequence or target motif. Such presentation provides a
skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in
the identified fragment.
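
A ranked output of this kind can be sketched as follows. This is a deliberately
naive, ungapped sliding-window comparison written only to illustrate the
ranking idea; it is not BLAST or any of the programs named herein, and the
fragment names and sequences are hypothetical.

```python
def best_window_identity(fragment: str, target: str) -> float:
    """Highest ungapped percent identity between the target and any
    window of the fragment having the same length as the target."""
    n = len(target)
    if n == 0 or len(fragment) < n:
        return 0.0
    best = 0
    for i in range(len(fragment) - n + 1):
        matches = sum(a == b for a, b in zip(fragment[i:i + n], target))
        best = max(best, matches)
    return 100.0 * best / n


def rank_fragments(fragments: dict[str, str], target: str) -> list[tuple[str, float]]:
    """Return (name, percent identity) pairs, best match first."""
    scored = [(name, best_window_identity(seq, target))
              for name, seq in fragments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fragments and target, for illustration only.
fragments = {
    "frag_a": "TTGACCATGGCTAAGT",
    "frag_b": "TTGACCATGACTAAGT",
    "frag_c": "GGGGGGGGGGGGGGGG",
}
ranked = rank_fragments(fragments, "ATGGCTAAG")
for name, identity in ranked:
    print(f"{name}\t{identity:.1f}%")
```

Each output row pairs a fragment with its degree of match, so the artisan sees
both the ranking and the score on which it is based.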

A variety of comparing means can be used to compare a target sequence or
target motif with the data storage means to identify sequence fragments of the
Streptococcus pneumoniae genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in
Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading
frames within the Streptococcus pneumoniae genome. A skilled artisan can readily
recognize that any one of the publicly available homology search programs can be
used as the search means for the computer-based systems of the present invention.
Of course, suitable proprietary systems that may be known to those of skill also
may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102
includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM)
and a variety of secondary storage devices 110, such as a hard drive 112 and a
removable medium storage device 114. The removable medium storage device 114
may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape
drive, etc. A removable storage medium 116 (such as a floppy disk, a compact
disk, a magnetic tape, etc.) containing control logic and/or data recorded therein
may be inserted into the removable medium storage device 114. The computer
system 102 includes appropriate software for reading the control logic and/or the
data from the removable medium storage device 114, once it is inserted into the
removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well
known manner in the main memory 108, any of the secondary storage devices 110,
and/or a removable storage medium 116. During execution, software for accessing
and processing the genomic sequence (such as search tools, comparing tools, etc.)
resides in main memory 108, in accordance with the requirements and operating
parameters of the operating system, the hardware system and the software program
or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to isolated
fragments of the Streptococcus pneumoniae genome. The fragments of the
Streptococcus pneumoniae genome of the present invention include, but are not
limited to, fragments which encode peptides and polypeptides, hereinafter open
reading frames (ORFs), fragments which modulate the expression of an operably
linked ORF, hereinafter expression modulating fragments (EMFs), and fragments
which can be used to diagnose the presence of Streptococcus pneumoniae in a
sample, hereinafter diagnostic fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment
of the Streptococcus pneumoniae genome" refers to a nucleic acid molecule
possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are
normally associated with the composition. Particularly, the term refers to the
nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-391, to
representative fragments thereof as described above, to polynucleotides at least
95%, preferably at least 99% and especially preferably at least 99.9% identical in
sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated
fragments of the present invention. These include, but are not limited to, methods
which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Streptococcus pneumoniae DNA can be enzymatically
sheared to produce fragments of 15-20 kb in length. These fragments can then be
used to generate a Streptococcus pneumoniae library by inserting them into lambda
clones as described in the Examples below. Primers flanking, for example, an
ORF, such as those enumerated in Tables 1-3, can then be generated using
nucleotide sequence information provided in SEQ ID NOS: 1-391. Well known
and routine techniques of PCR cloning then can be used to isolate the ORF from
the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given
the availability of SEQ ID NOS: 1-391, the information in Tables 1, 2 and 3, and
the information that may be obtained readily by analysis of the sequences of SEQ
ID NOS: 1-391 using methods set out above, those of skill will be enabled by the
present disclosure to isolate any ORF-containing or other nucleic acid fragment of
the present invention.

The isolated nucleic acid molecules of the present invention include, but are
not limited to, single stranded and double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets
coding for amino acids without any termination codons and is a sequence
translatable into protein.

Tables 1, 2, and 3 list ORFs in the Streptococcus pneumoniae genomic
contigs of the present invention that were identified as putative coding regions by
the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in
accordance with well known analytical methods, such as those discussed herein, to
generate more inclusive, more restrictive, or more selective lists.

Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that over a continuous region of at least 50 bases are 95% or
more identical (by BLAST analysis) to a nucleotide sequence available through
GenBank in October, 1997.

Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that are not in Table 1 and match, with a BLASTP probability
score of 0.01 or less, a polypeptide sequence available through GenBank in
October, 1997.

Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that do not match significantly, by BLASTP analysis, a
polypeptide sequence available through GenBank in October, 1997.

In each table, the first and second columns identify the ORF by,
respectively, contig number and ORF number within the contig; the third column
indicates the first nucleotide of the ORF (actually the first nucleotide of the stop
codon immediately preceding the ORF), counting from the 5' end of the contig
strand; and the fourth column, "stop (nt)," indicates the last nucleotide of the stop
codon defining the 3' end of the ORF.

In Tables 1 and 2, column five lists the reference for the closest
matching sequence available through GenBank. These reference numbers are the
database entry numbers commonly used by those of skill in the art, who will be
familiar with their designations. Descriptions of the nomenclature are available
from the National Center for Biotechnology Information. Column six in Tables 1
and 2 provides the gene name of the matching sequence; column seven provides
the BLAST identity score and column eight the BLAST similarity score from the
comparison of the ORF and the homologous gene; and column nine indicates the
length in nucleotides of the highest scoring segment pair identified by the BLAST
identity analysis.

Each ORF described in the tables is defined by "start (nt)" (5') and "stop
(nt)" (3') nucleotide position numbers. These position numbers refer to the
boundaries of each ORF and provide orientation with respect to whether the
forward or reverse strand is the coding strand and in which reading frame the coding
sequence is contained. The "start" position is the first nucleotide of the triplet
encoding a stop codon just 5' to the ORF and the "stop" position is the last
nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon
at the 3' end of the ORF). Those of ordinary skill in the art appreciate that
preferred fragments within each ORF described in the table include fragments of
each ORF which include the entire sequence from the delineated "start" and "stop"
positions excepting the first and last three nucleotides since these encode stop
codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three
(3) 5' nucleotides and the three (3) 3' nucleotides are encompassed by the present
invention. Those of skill also appreciate that particularly preferred are fragments
within each ORF that are polynucleotide fragments comprising polypeptide coding
sequence. As defined herein, "coding sequence" includes the fragment within an
ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending
with the last nucleotide prior to the triplet encoding the 3' stop codon. Preferred
are fragments comprising the entire coding sequence and fragments comprising the
entire coding sequence, excepting the coding sequence for the N-terminal
methionine. Those of skill appreciate that the N-terminal methionine is often
removed during post-translational processing and that polynucleotides lacking the
ATG can be used to facilitate production of N-terminal fusion proteins which may
be beneficial in the production or use of genetically engineered proteins. Of course,
due to the degeneracy of the genetic code many polynucleotides can encode a given
polypeptide. Thus, the invention further includes polynucleotides comprising a
nucleotide sequence encoding a polypeptide sequence itself encoded by the coding
sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides
at least 95%, preferably at least 99% and especially preferably at least 99.9%
identical in sequence to the foregoing polynucleotides, are contemplated by the
present invention.

Polypeptides encoded by polynucleotides described above and elsewhere
herein are also provided by the present invention, as are polypeptides comprising
an amino acid sequence at least about 95%, preferably at least 97% and even more
preferably 99% identical to the amino acid sequence of a polypeptide encoded by an
ORF shown in Tables 1-3. These polypeptides may or may not comprise an
N-terminal methionine.
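
The "start (nt)" and "stop (nt)" conventions and the definition of "coding
sequence" given above can be sketched in a few lines. This is a minimal
illustration, not the software used to prepare the tables. It assumes 1-based,
inclusive coordinates and, as a labeled assumption, treats a start position
greater than the stop position as indicating the reverse strand. The contig
shown is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orf_region(contig: str, start_nt: int, stop_nt: int) -> str:
    """Stop-to-stop ORF region, read 5'->3' on its coding strand.

    Coordinates are 1-based and inclusive. Assumption (for illustration):
    start_nt > stop_nt means the ORF lies on the reverse strand.
    """
    if start_nt <= stop_nt:
        return contig[start_nt - 1:stop_nt]
    return reverse_complement(contig[stop_nt - 1:start_nt])


def coding_sequence(region: str) -> str:
    """Drop the flanking stop codons (first and last three nucleotides),
    then trim to the first in-frame ATG. Returns '' if no ATG is found."""
    inner = region[3:-3]
    for i in range(0, len(inner) - 2, 3):
        if inner[i:i + 3] == "ATG":
            return inner[i:]
    return ""


# Hypothetical contig: CC | TAA | GCT ATG AAA GGT | TGA | CC
contig = "CCTAAGCTATGAAAGGTTGACC"
region = orf_region(contig, 3, 20)
print(region)                   # stop-to-stop region, including both stop codons
print(coding_sequence(region))  # from the first in-frame ATG to just before the 3' stop
```

Removing the first three nucleotides of the returned coding sequence gives the
variant lacking the codon for the N-terminal methionine.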

The concepts of percent identity and percent similarity of two polypeptide
sequences are well understood in the art. For example, two polypeptides 10 amino
acids in length which differ at three amino acid positions (e.g., at positions 1, 3
and 5) are said to have a percent identity of 70%. However, the same two
polypeptides would be deemed to have a percent similarity of 80% if, for example
at position 5, the amino acid moieties, although not identical, were "similar" (i.e.,
possessed similar biochemical characteristics). Many programs for analysis of
nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically
list percent identity of a matching region as an output parameter. Thus, for
instance, Tables 1 and 2 herein enumerate the percent identity of the highest
scoring segment pair in each ORF and its listed relative. Further details
concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations
provided below.
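
The worked example above can be reproduced with a short sketch. The two
peptides and the grouping of "similar" amino acids are hypothetical choices
made only for illustration; programs such as FASTA and BLAST derive similarity
from substitution matrices rather than from fixed groups like these.

```python
# Hypothetical groups of biochemically similar residues (illustration only).
SIMILAR_GROUPS = [set("DE"), set("KR"), set("ILVM"),
                  set("ST"), set("FYW"), set("NQ")]


def are_similar(a: str, b: str) -> bool:
    """Identical residues, or residues sharing one of the groups above."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def percent_identity_and_similarity(p: str, q: str) -> tuple[float, float]:
    """Position-by-position comparison of two equal-length peptides."""
    if len(p) != len(q) or not p:
        raise ValueError("peptides must be non-empty and of equal length")
    identical = sum(a == b for a, b in zip(p, q))
    similar = sum(are_similar(a, b) for a, b in zip(p, q))
    return 100.0 * identical / len(p), 100.0 * similar / len(p)


# Two hypothetical 10-residue peptides differing at positions 1, 3 and 5;
# only the position-5 pair (E/D) falls within one similarity group.
identity, similarity = percent_identity_and_similarity("MKTLEVAGSR",
                                                       "GKWLDVAGSR")
print(identity, similarity)
```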

It will be appreciated that other criteria can be used to generate more
inclusive and more exclusive listings of the types set out in the tables. As those of
skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae
genome other than those listed in Tables 1-3, such as ORFs which are overlapping
or encoded by the opposite strand of an identified ORF in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a
series of nucleotide molecules which modulates the expression of an operably
linked ORF or EMF.






As used herein, a sequence is said to "modulate the expression of an
operably linked sequence" when the expression of the sequence is altered by the
presence of the EMF. EMFs include, but are not limited to, promoters and
promoter modulating sequences (inducible elements). One class of EMFs are
fragments which induce the expression of an operably linked ORF in response to a
specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Streptococcus 
pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An 
intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 
nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate 
the expression of an operably linked ORF in a fashion similar to that found with the 
naturally linked ORF sequence. As used herein, an "intergenic segment" refers to 
fragments of the Streptococcus pneumoniae genome which are between two 
ORF(s) herein described. EMFs also can be identified using known EMFs as a 
target sequence or target motif in the computer-based systems of the present 
invention. Further, the two methods can be combined and used together. 

The presence and activity of an EMF can be confirmed using an EMF trap
vector. An EMF trap vector contains a cloning site linked to a marker sequence. A
marker sequence encodes an identifiable phenotype, such as antibiotic resistance or
a complementing nutrition auxotrophic factor, which can be identified or assayed
when the EMF trap vector is placed within an appropriate host under appropriate
conditions. As described above, an EMF will modulate the expression of an
operably linked marker sequence. A more detailed discussion of various marker
sequences is provided below. A sequence which is suspected of being an EMF is
cloned in all three reading frames in one or more restriction sites upstream from the
marker sequence in the EMF trap vector. The vector is then transformed into an
appropriate host using known procedures and the phenotype of the transformed
host is examined under appropriate conditions. As described above, an EMF will
modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide
molecules which selectively hybridize to Streptococcus pneumoniae sequences.
DFs can be readily identified by identifying unique sequences within contigs of the
Streptococcus pneumoniae genome, such as by using well-known computer
analysis software, and by generating and testing probes or amplification primers
consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not
limited to the specific sequences herein described, but also include allelic and
species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS: 1-391, a
representative fragment thereof, or a nucleotide sequence at least 95%, preferably
at least 99% and most preferably at least 99.9% identical to SEQ ID NOS: 1-391,
with a sequence from another isolate of the same species. Furthermore, to
accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon
for another which encodes the same amino acid is expressly contemplated. Any
specific sequence disclosed herein can be readily screened for errors by
resequencing a particular fragment, such as an ORF, in both directions (i.e.,
sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Streptococcus pneumoniae origin
isolated by using part or all of the fragments in question as a probe or primer.

Preferred DFs of the present invention comprise at least about 17,
preferably at least about 20, and more preferably at least about 50 contiguous
nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs
specifically hybridize to a polynucleotide containing the sequence of the ORF from
which they are derived. Specific hybridization occurs even under stringent
conditions defined elsewhere herein.

Each of the ORFs of the Streptococcus pneumoniae genome disclosed in
Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as
polynucleotide reagents in numerous ways. For example, the sequences can be
used as diagnostic probes or diagnostic amplification primers to detect the presence
of a specific microbe in a sample, particularly Streptococcus pneumoniae.
Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most
likely to be highly selective for Streptococcus pneumoniae. Also particularly
preferred are ORFs that can be used to distinguish between strains of Streptococcus
pneumoniae, particularly those that distinguish medically important strains, such as
drug-resistant strains.

In addition, the fragments of the present invention, as broadly described,
can be used to control gene expression through triple helix formation or antisense
DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are
usually 20 to 40 bases in length and are designed to be complementary to a region
of the gene involved in transcription, for triple helix formation, or to the mRNA
itself, for antisense inhibition. Both techniques have been demonstrated to be
effective in model systems, and the requisite techniques are well known and
involve routine procedures. Triple helix techniques are discussed in, for example,
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in
general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988).

The present invention further provides recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention. Certain preferred recombinant constructs of the
present invention comprise a vector, such as a plasmid or viral vector, into which a
fragment of the Streptococcus pneumoniae genome has been inserted, in a forward
or reverse orientation. In the case of a vector comprising one of the ORFs of the
present invention, the vector may further comprise regulatory sequences, including
for example, a promoter, operably linked to the ORF. For vectors comprising the
EMFs of the present invention, the vector may further comprise a marker sequence
or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of
skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK,
pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene);
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia).
Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG
(available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from
Pharmacia).

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol acetyl transferase) vectors or other vectors with selectable markers.
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the
appropriate vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the
isolated fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host
cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or
a prokaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct
comprising an ORF of the present invention, may be introduced into the host by a
variety of well established techniques that are standard in the art, such as calcium
phosphate transfection, DEAE-dextran mediated transfection and electroporation,
which are described in, for instance, Davis, L. et al., BASIC METHODS IN
MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Streptococcus
pneumoniae genomic fragments and contigs of the present invention can be used
in conventional manners to produce the gene product encoded by the isolated
fragment (in the case of an ORF) or can be used to produce a heterologous protein
under the control of the EMF.

The present invention further provides
isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present
invention. By "degenerate variant" is intended nucleotide fragments which differ
from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide
sequence but, due to the degeneracy of the Genetic Code, encode an identical
polypeptide sequence.
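
The notion of a degenerate variant can be checked mechanically: two coding
sequences are degenerate variants of one another if they differ in nucleotide
sequence yet translate to the same polypeptide. The sketch below uses the
standard genetic code; the example sequences are hypothetical and the function
names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order;
# '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def translate(coding_seq: str) -> str:
    """Translate an uppercase DNA coding sequence, codon by codon."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[coding_seq[i:i + 3]]
                   for i in range(0, len(coding_seq), 3))


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode an identical polypeptide."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: GCT/GCC both encode Ala, AAA/AAG both encode Lys.
print(translate("ATGGCTAAA"))
print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
print(is_degenerate_variant("ATGGCTAAA", "ATGGCTAGA"))
```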

Preferred nucleic acid fragments of the present invention are the ORFs and
subfragments thereof depicted in Tables 2 and 3 which encode proteins.

A variety of methodologies known in the art can be utilized to obtain any
one of the isolated polypeptides or proteins of the present invention. At the
simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small
peptides and fragments of larger polypeptides. Such short fragments as may be
obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from
bacterial cells which naturally produce the polypeptide or protein. One skilled in
the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention
produced naturally by a bacterial strain, or by other methods. Methods for
isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified
from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell, through genetic manipulation, is made to produce a
polypeptide or protein which it normally does not produce or which the cell
normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell which
produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of
the present invention. These include, but are not limited to, eukaryotic hosts such
as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and B. subtilis. The most preferred cells are those which do not
normally express the particular polypeptide or protein or which express the
polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is
derived from recombinant (e.g., microbial or mammalian) expression systems.
"Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous
substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides.
Generally, DNA segments encoding the polypeptides and proteins provided by this
invention are assembled from fragments of the Streptococcus pneumoniae genome
and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a
synthetic gene which is capable of being expressed in a recombinant transcriptional
unit comprising regulatory elements derived from a microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The
expression vehicle can comprise a transcriptional unit comprising an assembly of
(1) genetic regulatory elements necessary for gene expression in the host,
including elements required to initiate and maintain transcription at a level sufficient
for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a
structural or coding sequence which is transcribed into mRNA and translated into
protein, and (3) appropriate signals to initiate translation at the beginning of the
desired coding region and terminate translation at its end. Structural units intended
for use in yeast or eukaryotic expression systems preferably include a leader
sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport
sequence, it may include an N-terminal methionine residue. This residue may or
may not be subsequently cleaved from the expressed recombinant protein to
provide a final product.

"Recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express
heterologous polypeptides or proteins upon induction of the regulatory elements
linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation
systems can also be employed to produce such proteins using RNAs derived from
the DNA constructs of the present invention. Appropriate cloning and expression
vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which
is hereby incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell, e.g.,
the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a
downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous
structural sequence is assembled in appropriate phase with translation initiation and
termination sequences, and preferably, a leader sequence capable of directing
secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
N-terminal identification peptide imparting desired characteristics, e.g., stabilization
or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable
translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable
markers and an origin of replication to ensure maintenance of the vector and, when
desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of E. coli, B.
subtilis, Salmonella typhimurium and various species within the genera
Pseudomonas and Streptomyces. Others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising genetic elements of the
well known cloning vector pBR322 (ATCC 37017). Such commercial vectors
include, for example, pKK223-3 (available from Pharmacia Fine Chemicals,
Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host 
strain to an appropriate cell density, the selected promoter, where it is inducible, is 
derepressed or induced by appropriate means (e.g., temperature shift or chemical 
induction) and cells are cultured for an additional period to provide for expression
of the induced gene product. Thereafter cells are typically harvested, generally by 
centrifugation, disrupted to release expressed protein, generally by physical or 
chemical means, and the resulting crude extract is retained for further purification. 

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example,
the C127, 3T3, CHO, HeLa and BHK cell lines. 

Mammalian expression vectors will comprise an origin of replication, a 
suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer,
splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are
usually isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any
convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high
performance liquid chromatography (HPLC) can be employed for final purification
steps.

The present invention further includes isolated polypeptides, proteins and
nucleic acid molecules which are substantially equivalent to those herein described.
As used herein, substantially equivalent can refer both to nucleic acid and amino
acid sequences, for example a mutant sequence, that varies from a reference
sequence by one or more substitutions, deletions, or additions, the net effect of
which does not result in an adverse functional dissimilarity between reference and
subject sequences. For purposes of the present invention, sequences having
equivalent biological activity and equivalent expression characteristics are
considered substantially equivalent. For purposes of determining equivalence,
truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs, from other
strains of Streptococcus pneumoniae, of the fragments of the Streptococcus
pneumoniae genome of the present invention and homologs of the proteins encoded
by the ORFs of the present invention. As used herein, a sequence or protein of
Streptococcus pneumoniae is defined as a homolog of one of the Streptococcus
pneumoniae fragments or contigs, or of a protein encoded by one of the
ORFs of the present invention, if it shares significant homology to one of the
fragments of the Streptococcus pneumoniae genome of the present invention or to a
protein encoded by one of the ORFs of the present invention. Specifically, by
using the sequence disclosed herein as a probe or as primers, and techniques such
as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain
homologs.

As used herein, two nucleic acid molecules or proteins are said to "share
significant homology" if the two contain regions which possess greater than 85%
sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those
with 93% or more homology. Among especially preferred homologs those with
95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those
are homologs with 99% or more homology. The most preferred homologs among
these are those with 99.9% homology or more. It will be understood that, among
measures of homology, identity is particularly preferred in this regard.
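
As a rough illustration of the homology tiers just recited, the following sketch
computes percent identity over two sequences that are assumed to be already aligned to
equal length, and reports the most stringent tier met. The function names, the
handling of gap characters, and the treatment of the 97% tier as inclusive are
assumptions for illustration; the text does not prescribe an alignment method.

```python
from typing import Optional


def percent_identity(a: str, b: str) -> float:
    """Percent of identical, non-gap positions between two pre-aligned
    sequences of equal length ('-' marks a gap)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


# (threshold in percent, inclusive?) ordered from most to least stringent.
# "Greater than 85%" and "more than 90%" are strict; "or more" tiers are inclusive.
_TIERS = [
    (99.9, True),
    (99.0, True),
    (97.0, True),   # assumption: the text gives "97%" without qualifier
    (95.0, True),
    (93.0, True),
    (90.0, False),
    (85.0, False),
]


def homology_tier(pct: float) -> Optional[float]:
    """Return the highest threshold satisfied by pct, or None if the
    sequences do not share significant homology as defined in the text."""
    for threshold, inclusive in _TIERS:
        if pct > threshold or (inclusive and pct == threshold):
            return threshold
    return None
```

For example, a pair of aligned regions at exactly 90% identity would fall in the
"greater than 85%" tier rather than the "more than 90%" tier under this reading.
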

Region specific primers or probes derived from the nucleotide sequence
provided in SEQ ID NOS: 1-391 or from a nucleotide sequence at least 95%,
particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS: 1-391 can be used to prime DNA synthesis and PCR amplification, as
well as to identify colonies containing cloned DNA encoding a homolog. Methods
suitable to this aspect of the present invention are well known and have been
described in great detail in many publications such as, for example, Innis et al.,
PCR Protocols, Academic Press, San Diego, CA (1990).
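
One simple way to derive a region specific primer pair from a disclosed sequence is
sketched below: the forward primer is taken from the 5' end of the region and the
reverse primer is the reverse complement of its 3' end. The function names and the
default primer length of 20 bases are assumptions for illustration; practical primer
design would also weigh melting temperature, GC content and secondary structure.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def region_primers(template: str, start: int, end: int,
                   length: int = 20) -> Tuple[str, str]:
    """Return (forward, reverse) primers flanking template[start:end],
    using 0-based coordinates with an exclusive end."""
    region = template[start:end]
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse
```

Both primers are written 5' to 3', which is the orientation in which
oligonucleotides are conventionally ordered and reported.
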

When using primers derived from SEQ ID NOS: 1-391 or from a nucleotide 
sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-391, 
one skilled in the art will recognize that by employing high stringency conditions 
(e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C
in 0.5X SSPC) only sequences which are greater than 75% homologous to
the primer will be amplified. By employing lower stringency conditions (e.g., 
hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C 
in 0.5X SSPC), sequences which are greater than 40-50% homologous to the 
primer will also be amplified. 

When using DNA probes derived from SEQ ID NOS: 1-391, or from a
nucleotide sequence having an aforementioned identity to a sequence of SEQ ID
NOS: 1-391, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X
SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences
having regions which are greater than 90% homologous to the probe can be
obtained, and that by employing lower stringency conditions (e.g., hybridizing at
35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X
SSPC), sequences having regions which are greater than 35-45% homologous to
the probe will be obtained.
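
The homology cut-offs stated above for primers and probes at high and lower
stringency can be collected in a small lookup, sketched here. The names THRESHOLDS
and expected_recovery, and the "borderline" label for values falling inside a stated
range such as 40-50%, are assumptions for illustration; the figures themselves are
those given in the text, and actual recovery depends on the experimental system.

```python
from typing import Dict, Tuple

# (lower, upper) bounds, in percent, of the minimum homology the text
# associates with each condition; a single stated value has lower == upper.
THRESHOLDS: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("primer", "high"): (75.0, 75.0),
    ("primer", "low"): (40.0, 50.0),
    ("probe", "high"): (90.0, 90.0),
    ("probe", "low"): (35.0, 45.0),
}


def expected_recovery(reagent: str, stringency: str,
                      percent_homology: float) -> str:
    """Classify whether a target of the given homology is expected to be
    amplified (primer) or obtained (probe) under the stated stringency."""
    lower, upper = THRESHOLDS[(reagent, stringency)]
    if percent_homology > upper:
        return "expected"
    if percent_homology > lower:
        # Inside the range the text leaves open (e.g. 40-50%).
        return "borderline"
    return "not expected"
```
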

Any organism can be used as the source for homologs of the present
invention so long as the organism naturally expresses such a protein or contains
genes encoding the same. The most preferred organisms for isolating homologs are
bacteria which are closely related to Streptococcus pneumoniae.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION

Each ORF provided in Tables 1 and 2 is identified with a function by 
homology to a known gene or polypeptide. As a result, one skilled in the art can 
use the polypeptides of the present invention for commercial, therapeutic and 
industrial purposes consistent with the type of putative identification of the
polypeptide. Such identifications permit one skilled in the art to use the
Streptococcus pneumoniae ORFs in a manner similar to the known type of
sequences for which the identification is made; for example, to ferment a particular
sugar source or to produce a particular metabolite. A variety of reviews illustrative
of this aspect of the invention are available, including the following reviews on the
industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND
BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY
(1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds.,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of
exemplary uses that illustrate this and similar aspects of the present invention are
discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins that mediate the catalytic
reactions involved in intermediary and macromolecular metabolism, the
biosynthesis of small molecules, cellular processes and other functions, including
enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in
respiration, both aerobic and anaerobic, enzymes involved in fermentation,
enzymes involved in ATP proton motive force conversion, enzymes involved in
broad regulatory function, enzymes involved in amino acid synthesis, enzymes
involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin
synthesis, can be used for industrial biosynthesis.

The various metabolic pathways present in Streptococcus pneumoniae can
be identified based on absolute nutritional requirements as well as by examining the
various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-391.

Of particular interest are polypeptides involved in the degradation of 
intermediary metabolites as well as non-macromolecular metabolism. Such 
enzymes include amylases, glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes.
Proteolytic enzymes find use in a number of industrial processes including the
processing of flax and other vegetable fibers, in the extraction, clarification and
depectinization of fruit juices, in the extraction of vegetable oil and in the
maceration of fruits and vegetables to give unicellular fruits. A detailed review of
the proteolytic enzymes used in the food industry is provided in Rombouts et al.,
Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural
Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium
Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism
of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars,
such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from
a commercial viewpoint, include sugar isomerases such as glucose isomerase.
Other metabolic enzymes have found commercial use, such as glucose oxidases
which produce ketogulonic acid (KGA). KGA is an intermediate in the
commercial production of ascorbic acid using Reichstein's procedure, as
described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press,
Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in
purified form as well as in an immobilized form for the deoxygenation of beer.
See, for instance, Hartmeir et al., Biotechnology Letters 7:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid.
There is a market for gluconic acids, which are used in the detergent, textile, leather,
photographic, pharmaceutical, food, feed and concrete industries, as described, for
example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS
AND FUNGI; Benett et al., Eds., Academic Press, New York (1985). In addition
to industrial applications, GOD has found applications in medicine for quantitative
determination of glucose in body fluids and, recently, in biotechnology for analyzing
syrups from starch and cellulose hydrolysates. This application is described in
Owusu et al., Biochem. et Biophysica. Acta. 572:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from
sugar beets and sugar cane. In the field of industrial enzymes, the glucose
isomerase process shows the largest expansion in the market today. Initially,
soluble enzymes were used and later immobilized enzymes were developed
(Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer
Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business
using immobilized enzymes. A review of the industrial use of these enzymes is
provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent
additives and thus represent one of the largest volumes of microbial enzymes used
in the industrial sector. Because of their industrial importance, there is a large body
of published and unpublished information regarding the use of these enzymes in
industrial processes. (See Faultman et al., Acid Proteases Structure Function and
Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al.,
Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al.,
Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are
the microbial lipases, described by, for instance, Macrae et al., Philosophical
Transactions of the Royal Society of London 310:221 (1985) and Poserke, Journal
of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in
the fat and oil industry for the production of neutral glycerides using lipase
catalyzed inter-esterification of readily available triglycerides. Applications of
lipases include use as a detergent additive to facilitate the removal of fats from
fabrics in the course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for
key steps in the synthesis of complex organic molecules is gaining popularity at a
great rate. One area of great interest is the preparation of chiral intermediates.
Preparation of chiral intermediates is of interest to a wide range of synthetic
chemists, particularly those scientists involved with the preparation of new
pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press,
Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of
interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters,
amides and nitriles, esterification reactions, trans-esterification reactions, synthesis
of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to
carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming
reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the
present invention for biotransformation and organic synthesis it is sometimes
necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole
cell system on the one hand or an isolated partially purified enzyme on the other
hand have been described in detail by Bud et al., Chemistry in Britain (1987), p.
127.

Amino transferases, enzymes involved in the biosynthesis and metabolism
of amino acids, are useful in the catalytic production of amino acids. The
advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and
generally possess uniformly high catalytic rates. A description of the use of amino
transferases for amino acid production is provided by Roselle-David, Methods of
Enzymology 756:479 (1987).

Another category of useful proteins encoded by the ORFs of the present
invention includes enzymes involved in nucleic acid synthesis, repair, and
recombination.

2. Generation of Antibodies 

As described here, the proteins of the present invention, as well as
homologs thereof, can be used in a variety of procedures and methods known in
the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the
protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well
as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of 
the proteins of the present invention and hybridomas which produce these 
antibodies. A hybridoma is an immortalized cell line which is capable of secreting 
a specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies
as well as hybridomas capable of producing the desired antibody are well known in
the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory
Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21
(1980); Kohler and Milstein, Nature 256:495-497 (1975)), and include the trioma
technique and the human B-cell hybridoma technique (Kozbor et al., Immunology
Today 4:72 (1983); pgs. 77-96 of Cole et al., in Monoclonal Antibodies And
Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse, rabbit,
etc.) which is known to produce antibodies can be immunized with the
polypeptide. Methods for immunization are well known in the art. Such methods
include subcutaneous or intraperitoneal injection of the polypeptide. One skilled in
the art will recognize that the amount of the protein encoded by the ORF of the
present invention used for immunization will vary based on the animal which is
immunized, the antigenicity of the peptide and the site of injection.

The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity. Methods
of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin
or galactosidase) or through the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and
allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to
identify the hybridoma cell which produces an antibody with the desired
characteristics. These include screening the hybridomas with an ELISA assay,
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124
(1988)).

Hybridomas secreting the desired antibodies are cloned and the class and
subclass are determined using procedures known in the art (Campbell, A. M.,
Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1984)).

Techniques described for the production of single chain antibodies (U. S. 
Patent 4,946,778) can be adapted to produce single chain antibodies to proteins of 
the present invention.

For polyclonal antibodies, antibody-containing antiserum is isolated from the
immunized animal and is screened for the presence of antibodies with the desired
specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in
detectably labelled form. Antibodies can be detectably labelled through the use of
radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such
as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as
FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing
such labeling are well known in the art (see, for example, Sternberger et al., J.
Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308
(1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol.
Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in
vivo, and in situ assays to identify cells or tissues in which a fragment of the
Streptococcus pneumoniae genome is expressed.

The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics
such as polycarbonate, complex carbohydrates such as agarose and sepharose,
acrylic resins such as polyacrylamide, and latex beads. Techniques for
coupling antibodies to such solid supports are well known in the art (Weir, D. M.
et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth.
Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the
present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits 

The present invention further provides methods to identify the expression 
of one of the ORFs of the present invention, or homolog thereof, in a test sample, 
using one of the DFs or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more 
of the antibodies or one or more of the DFs of the present invention and assaying 
for binding of the DFs or antibodies to components within the test sample. 

Conditions for incubating a DF or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the DF or antibody used in the
assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted
to employ the DFs or antibodies of the present invention. Examples of such assays
can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986);
Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press,
Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985).

The test samples of the present invention include cells, protein or membrane 
extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or 
urine. The test sample used in the above-described method will vary based on the
assay format, nature of the detection method and the tissues, cells or extracts used
as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to
obtain a sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers, which comprises: (a) a first container
comprising one of the DFs or antibodies of the present invention; and (b) one or
more other containers comprising one or more of the following: wash reagents and
reagents capable of detecting the presence of a bound DF or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are
contained in separate containers. Such containers include small glass containers,
plastic containers or strips of plastic or paper. Such containers allow one to
efficiently transfer reagents from one compartment to another compartment such
that the samples and reagents are not cross-contaminated, and the agents or
solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept
the test sample, a container which contains the antibodies used in the assay,
containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or DF.

Types of detection reagents include labelled nucleic acid probes, labelled 
secondary antibodies, or in the alternative, if the primary antibody is labelled, the 
enzymatic, or antibody binding reagents which are capable of reacting with the
labelled antibody. One skilled in the art will readily recognize that the disclosed 
DFs and antibodies of the present invention can be readily incorporated into one of 
the established kit formats which are well known in the art. 

4. Screening Assay for Binding Agents

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents which bind to a
protein encoded by one of the ORFs of the present invention or to one of the
Streptococcus pneumoniae fragments and contigs herein
described.

In general, such methods comprise the steps of:

(a) contacting an agent with an isolated protein encoded by one of the
ORFs of the present invention, or an isolated fragment of the Streptococcus
pneumoniae genome; and
(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, 
peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The 
agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates,
pharmaceutical agents and the like are selected at random and are assayed for their
ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used
herein, an agent is said to be "rationally selected or designed" when the agent is
chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate
peptides, pharmaceutical agents and the like capable of binding to a specific peptide
sequence in order to generate rationally designed antipeptide peptides (see, for
example, Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307,
and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or
the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one
of the ORFs or EMFs of the present invention. As described above, such agents
can be randomly screened or rationally designed/selected. Targeting the ORF or
EMF allows a skilled artisan to design sequence specific or element specific agents,
modulating the expression of either a single ORF or multiple ORFs which rely on
the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues
which hybridize or form a triple helix by binding to DNA or RNA. Such agents
can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a
variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and
are designed to be complementary to a region of the gene involved in transcription
(triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the
mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides, and other DNA binding agents.
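
The antisense case can be sketched as follows: an oligonucleotide of 20 to 40 bases
is taken as the reverse complement of a chosen window of the mRNA. The function name
antisense_oligo, the DNA output alphabet and the 0-based start coordinate are
assumptions for illustration; the choice of target window, backbone chemistry and
triple helix design rules are outside the scope of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide (written 5' to 3') complementary to
    mrna[start:start + length]. Accepts RNA (U) or DNA (T) input."""
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    # Work in the DNA alphabet so the complement table stays small.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Antisense strand is the reverse complement of the target window.
    return target.translate(_COMPLEMENT)[::-1]
```

A window overlapping the translation initiation region is a common first choice in
the antisense literature, but that choice is a design assumption and not something
the text specifies.
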

5. Pharmaceutical Compositions and Vaccines

The present invention further provides pharmaceutical agents which can be
used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or
another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known
techniques to provide a pharmaceutical composition. As used herein, the
"pharmaceutical agents of the present invention" refers to the pharmaceutical agents
which are derived from the proteins encoded by the ORFs of the present invention
or are agents which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or
pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in
vitro" when the agent reduces the rate of growth, rate of division, or viability of
the organism in question. The pharmaceutical agents of the present invention can
modulate the growth or pathogenicity of an organism in many fashions, although
an understanding of the underlying mechanism of action is not needed to practice
the use of the pharmaceutical agents of the present invention. Some agents will
modulate the growth by binding to an important protein, thus blocking the biological
activity of the protein, while other agents may bind to a component of the outer
surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a
protein encoded by one of the ORFs of the present invention and serve as a
vaccine. The development and use of vaccines based on outer membrane
components are well known in the art.

As used herein, a "related organism" is a broad term which refers to any 
organism whose growth can be modulated by one of the pharmaceutical agents of 
the present invention. In general, such an organism will contain a homolog of the 
protein which is the target of the pharmaceutical agent or the protein used as a
vaccine. As such, related organisms do not need to be bacterial but may be fungal
or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may
be administered in a convenient manner, such as by the oral, topical, intravenous,
intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The
pharmaceutical compositions are administered in an amount which is effective for
treatment and/or prophylaxis of the specific indication. In general, they are
administered in an amount of at least about 1 mg/kg body weight and in most cases
they will be administered in an amount not in excess of about 1 g/kg body weight
per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body
weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be
modified to form a chemical derivative. As used herein, a molecule is said to be a
"chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the
molecule's solubility, absorption, biological half life, etc. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, etc. Moieties capable of mediating such
effects are disclosed in, among other sources, REMINGTON'S
PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the
functional derivative, such as affinity for a given antibody. Such changes in
immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox
or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic
degradation or the tendency to aggregate with carriers or into multimers also may
be effected in this way and can be assayed by methods well known to the skilled
artisan.

The therapeutic effects of the agents of the present invention may be
obtained by providing the agent to a patient by any suitable means (e.g., inhalation,
intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an
effective concentration within the blood or tissue in which the growth of the
organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be
by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the
dosage of the administered agent will vary depending upon such factors as the
patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of
agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of
patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents
of the present invention or another agent.

As used herein, two or more compounds or agents are said to be
administered "in combination" with each other when either (1) the physiological
effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be
administered concurrently with, prior to, or following the administration of the
other agent.

The agents of the present invention are intended to be provided to recipient
subjects in an amount sufficient to decrease the rate of growth (as defined above) of
the target organism.

The administration of the agent(s) of the invention may be for either a
"prophylactic" or "therapeutic" purpose. When provided prophylactically, the
agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent,
attenuate, or decrease the rate of onset of any subsequent infection. When
provided therapeutically, the agent(s) are provided at (or shortly after) the onset of
an indication of infection. The therapeutic administration of the compound(s)
serves to attenuate the pathological symptoms of the infection and to increase the
rate of recovery.

The agents of the present invention are administered to a subject, such as a
mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically
effective concentration. A composition is said to be "pharmacologically acceptable"
if its administration can be tolerated by a recipient patient. Such an agent is said to
be administered in a "therapeutically effective amount" if the amount administered
is physiologically significant. An agent is physiologically significant if its presence
results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known
methods to prepare pharmaceutically useful compositions, whereby these materials,
or their functional derivatives, are combined in a mixture with a pharmaceutically
acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of
other human proteins, e.g., human serum albumin, are described, for example, in
REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed.,
Mack Publishing, Easton PA (1980). In order to form a pharmaceutically
acceptable composition suitable for effective administration, such compositions will
contain an effective amount of one or more of the agents of the present invention,
together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration
of action. Controlled release preparations may be achieved through the use of
polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques,
including formulation with macromolecules such as, for example, polyesters,
polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the
macromolecules and the agent in the formulation, and by appropriate use of
methods of incorporation, which can be manipulated to effectuate a desired time
course of release. Another possible method to control the duration of action by
controlled release preparations is to incorporate agents of the present invention into
particles of a polymeric material such as polyesters, polyamino acids, hydrogels,
poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is possible to entrap these
materials in microcapsules prepared, for example, by coacervation techniques or by
interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively,
or in colloidal drug delivery systems, for example, liposomes, albumin
microspheres, microemulsions, nanoparticles, and nanocapsules or in
macroemulsions. Such techniques are disclosed in REMINGTON'S
PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one
or more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in
the form prescribed by a governmental agency regulating the manufacture, use or
sale of pharmaceuticals or biological products, which notice reflects approval by
the agency of manufacture, use or sale for human administration.

In addition, the agents of the present invention may be employed in 
conjunction with other therapeutic compounds. 

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be
sequenced using a random shotgun approach. This procedure, described in detail
in the examples that follow, has eliminated the up front cost of isolating and
ordering overlapping or contiguous subclones prior to the start of the sequencing
protocols.

Certain aspects of the present invention are described in greater detail in the 
examples that follow. The examples are provided by way of illustration. Other 
aspects and embodiments of the present invention are contemplated by the 
inventors, as will be clear to those of skill in the art from reading the present 
disclosure.

ILLUSTRATIVE EXAMPLES 

LIBRARIES AND SEQUENCING 
1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing
follows from the Lander and Waterman (Lander and Waterman, Genomics
2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L,
in nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random
sequence has been determined can be calculated by the equation $P_0 = e^{-m}$,
where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1
when 2.8 Mb of sequence has been randomly generated (1X coverage). At that
point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not been
sequenced is the same as the probability that any region of the whole sequence L
has not been determined and, therefore, is equivalent to the fraction of the whole
sequence that has yet to be determined. Thus, at one-fold coverage, approximately
37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When
14 Mb of sequence has been generated, coverage is 5X for a 2.8 Mb sequence and
the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage of a 2.8 Mb
sequence can be attained by sequencing approximately 17,000 random clones from
both insert ends with an average sequence read length of 410 bp.

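Similarly, the total gap length, G, is determined by the equation
$G = Le^{-m}$, and the average gap size, g, follows the equation $g = L/n$, where n
in this equation denotes the number of random sequence reads rather than the
number of nucleotides sequenced (2.8 Mb divided by about 34,000 reads is about
82 bp). Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in
a sequence of a polynucleotide 2.8 Mb long.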

The treatment above is essentially that of Lander and Waterman, Genomics 
2: 231 (1988). 
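
The arithmetic in this section can be checked with a short calculation. The
following Python sketch is illustrative only: the function names are
hypothetical, and it assumes that fold coverage is the number of random
nucleotides sequenced divided by the sequence length, and that the average gap
size divides the sequence length by the number of reads (two reads per clone).

```python
import math


def unsequenced_fraction(bases_sequenced, sequence_length):
    """Poisson estimate P0 = e^(-m), with m the fold coverage."""
    fold_coverage = bases_sequenced / sequence_length
    return math.exp(-fold_coverage)


def total_gap_length(bases_sequenced, sequence_length):
    """Expected number of unsequenced bases, G = L * e^(-m)."""
    return sequence_length * unsequenced_fraction(bases_sequenced, sequence_length)


def average_gap_size(sequence_length, number_of_reads):
    """Average gap size, taken here as L divided by the number of reads."""
    return sequence_length / number_of_reads


# Figures quoted in the text: a 2.8 Mb sequence, 17,000 clones read from
# both insert ends, and an average read length of 410 bp.
GENOME_BP = 2_800_000
CLONES = 17_000
READS = 2 * CLONES
READ_LENGTH_BP = 410

p0_at_1x = unsequenced_fraction(GENOME_BP, GENOME_BP)
p0_at_5x = unsequenced_fraction(5 * GENOME_BP, GENOME_BP)
bases_from_clones = READS * READ_LENGTH_BP
gap_bp = average_gap_size(GENOME_BP, READS)

print(f"P0 at 1X: {p0_at_1x:.2f}")
print(f"P0 at 5X: {p0_at_5x:.4f}")
print(f"Bases from {CLONES} clones: {bases_from_clones}")
print(f"Average gap size: {gap_bp:.0f} bp")
```

Under these assumptions, 17,000 clones sequenced from both ends yield just under
14 Mb of raw sequence, which is consistent with the 5X figure given above.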

2. Random Library Construction

In order to approximate the random model described above during actual 
sequencing, a nearly ideal library of cloned genomic fragments is required. The 
following library construction procedure was developed to achieve this end. 

Streptococcus pneumoniae DNA is prepared by phenol extraction. A
mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM
Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI
Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The
sheared DNA is ethanol precipitated and redissolved in 500 µl TE buffer.

To create blunt-ends, a 100 µl aliquot of the resuspended DNA is digested
with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200
µl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated,
redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis
through a 1.0% low melting temperature agarose gel. The section containing DNA
fragments 1.6-2.0 kb in size is excised from the gel, the LGT agarose is melted,
and the resulting solution is extracted with phenol to separate the agarose from the
DNA. The DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for
ligation to vector.

A two-step ligation procedure is used to produce a plasmid library with
97% inserts, of which >99% were single inserts. The first ligation mixture (50 µl)
contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase
(GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is
phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in
20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete
bands in a ladder are visualized by ethidium bromide-staining and UV illumination
and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of
the gel containing v+i DNA is excised and the v+i DNA is recovered and
resuspended into 20 µl TE. The v+i DNA then is blunt-ended by T4 polymerase
treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+i linears,
500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England
BioLabs), under recommended buffer conditions. After phenol extraction and
ethanol precipitation the repaired v+i linears are dissolved in 20 µl TE. The final
ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the
following day, the reaction mixture is stored at -20°C.

This two-stage procedure results in a molecularly random collection of 
single-insert plasmid recombinants with minimal contamination from double-insert 
chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in
the host, E. coli host cells deficient in all recombination and restriction functions
(A. Greener, Strategies 3 (1):5 (1990)) are used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells are
plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase
which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE
II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a
chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol
is added to the aliquot of cells to a final concentration of 25 mM. Cells are
incubated on ice for 10 min. A 1 µl aliquot of the final ligation is added to the cells
and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and
placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated
from this protocol in order to minimize the preferential growth of any given
transformed cell. Instead the transformation mixture is plated directly on a nutrient
rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g
tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5
ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml
SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal
(2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar. The 15 ml top layer
is poured just prior to plating. Our titer is approximately 100 colonies/10 µl aliquot
of transformation.
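
The beta-mercaptoethanol dilution in the plating protocol can be checked with
the usual C1V1 = C2V2 relation. The sketch below is illustrative; the function
name is hypothetical and it assumes the volumes are simply additive. The
computed value comes out slightly under the nominal 25 mM stated in the
protocol.

```python
def final_concentration_mM(stock_molar, stock_volume_ul, sample_volume_ul):
    """Final concentration (mM) after adding a stock solution to a sample.

    Assumes the two volumes are additive (C1 * V1 = C2 * (V1 + V2)).
    """
    total_volume_ul = stock_volume_ul + sample_volume_ul
    return stock_molar * 1000.0 * stock_volume_ul / total_volume_ul


# 1.7 µl of 1.42 M beta-mercaptoethanol added to a 100 µl aliquot of cells.
bme_mM = final_concentration_mM(1.42, 1.7, 100.0)
print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
```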

All colonies are picked for template preparation regardless of size. Thus, 

only clones lost due to "poison" DNA or deleterious gene products are deleted from 

the library, resulting in a slight increase in gap number over that expected. 

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a
"boiling bead" method developed in collaboration with Advanced Genetic
Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991);
Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a
96-well format for all stages of DNA preparation from bacterial growth through
final DNA purification. Template concentration is determined using Hoechst Dye
and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding
templates are identified where possible and not sequenced.

Templates are also prepared from two Streptococcus pneumoniae lambda
genomic libraries. An amplified library is constructed in the vector Lambda
GEM-12 (Promega) and an unamplified library is constructed in Lambda DASH II
(Stratagene). In particular, for the unamplified lambda library, Streptococcus
pneumoniae DNA (> 100 kb) is partially digested in a reaction mixture (200 µl)
containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C.
The digested DNA is phenol-extracted and electrophoresed on a 0.5% low
melting agarose gel at 2 V/cm for 7 hours. Fragments from 15 to 25 kb are excised
and recovered in a final volume of 6 µl. One µl of fragments is used with 1 µl of
DASH II vector (Stratagene) in the recommended ligation reaction. One µl of the
ligation mixture is used per packaging reaction following the recommended
protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage
are plated directly without amplification from the packaging mixture (after dilution
with 500 µl of recommended SM buffer and chloroform treatment). Yield is about
2.5x1 0$ pfu/µl. The amplified library is prepared essentially as above except the
lambda GEM-12 vector is used. After packaging, about 3.5x1 0^ pfu are plated on
the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and
stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^
pfu/ml.

Liquid lysates (100 µl) are prepared from randomly selected plaques (from
the unamplified library) and template is prepared by long-range PCR using T7 and
T3 vector-specific primers.

Sequencing reactions are carried out on plasmid and/or PCR templates
using the AB Catalyst LabStation with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and
the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye
terminator sequencing reactions are carried out on the lambda templates on a
Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction
Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence
the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library.

Sequencing reactions are performed by eight individuals using an average of
fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed
using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read
distance. The overall sequencing success rate is approximately 85% for
M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The
average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1
sequences, and 375 bp for dye-terminator reactions.

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND
ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press,
London, (1994) described the value of using sequence from both ends of
sequencing templates to facilitate ordering of contigs in shotgun assembly projects
of lambda and cosmid clones. We balance the desirability of both-end sequencing
(including the reduced cost of lower total number of templates) against shorter
read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer
compared to the M13-21 (forward) primer. Approximately one-half of the
templates are sequenced from both ends. Random reverse sequencing reactions are
done based on successful forward sequencing reactions. Some M13RP1
sequences are obtained in a semi-directed fashion: M13-21 sequences pointing
outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to
specifically order contigs.

4. Protocol for Automated Cycle Sequencing

The sequencing is carried out using ABI Catalyst robots and AB 373
Automated DNA Sequencers. The Catalyst robot is a publicly available
sophisticated pipetting and temperature control robot which has been developed
specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted
templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the
thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and
reaction buffer. Reaction mixes and templates are combined in the wells of an
aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear
amplification (i.e., one primer synthesis) steps are performed including
denaturation, annealing of primer and template, and extension; i.e., DNA
synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents
evaporation without the need for an oil overlay.

Two sequencing protocols are used: one for dye-labelled primers and a
second for dye-labelled dideoxy chain terminators. The shotgun sequencing
involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye,
permitting the four individual reactions to be combined into one lane of the 373
DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently
supplies pre-mixed reaction mixes in bulk packages containing all the necessary
non-template reagents for sequencing. Sequencing can be done with both plasmid
and PCR-generated templates with both dye-primers and dye-terminators with
approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions are loaded per AB 373 Sequencer each day, for a total
of 960 samples. Electrophoresis is run overnight following the manufacturer's
protocols, and the data is collected for twelve hours. Following electrophoresis
and fluorescence detection, the ABI 373 performs automatic lane tracking and
base-calling. The lane-tracking is confirmed visually. Each sequence
electropherogram (or fluorescence lane trace) is inspected visually and assessed for
quality. Trailing sequences of low quality are removed and the sequence itself is
loaded via software to a Sybase database (archived daily to 8mm tape). Leading
vector polylinker sequence is removed automatically by a software program.
Average edited lengths of sequences from the standard ABI 373 are around 400 bp
and depend mostly on the quality of the template used for the sequencing reaction.
ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis
path prior to fluorescence detection and increase the average number of usable
bases to 500-600 bp.

INFORMATICS

1. Data Management

A number of information management systems for a large-scale sequencing
lab have been developed. (For review see, for instance, Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on
System Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)).
The system used to collect and assemble the sequence data was developed using the
Sybase relational database management system and was designed to automate data
flow wherever possible and to reduce user error. The database stores and
correlates all information collected during the entire operation from template
preparation to final analysis of the genome. Because the raw output of the ABI 373
Sequencers was based on a Macintosh platform and the data management system
chosen was based on a Unix platform, it was necessary to design and implement a
variety of multi-user, client-server applications which allow the raw data as well as
analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly

An assembly engine (TIGR Assembler) developed for the rapid and
accurate assembly of thousands of sequence fragments was employed to generate
contigs. The TIGR assembler simultaneously clusters and assembles fragments of
the genome. In order to obtain the speed necessary to assemble more than 10^
fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences
to generate a list of potential sequence fragment overlaps. The number of potential
overlaps for each fragment determines which fragments are likely to fall into
repetitive elements. Beginning with a single seed sequence fragment, TIGR
Assembler extends the current contig by attempting to add the best matching
fragment based on oligonucleotide content. The contig and candidate fragment are
aligned using a modified version of the Smith-Waterman algorithm which provides
for optimal gapped alignments (Waterman, M. S., Methods in Enzymology
164:165 (1988)). The contig is extended by the fragment only if strict criteria for
the quality of the match are met. The match criteria include the minimum length of
overlap, the maximum length of an unmatched end, and the minimum percentage
match. These criteria are automatically lowered by the algorithm in regions of
minimal coverage and raised in regions with a possible repetitive element.
Fragments representing the boundaries of repetitive elements and potentially
chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to
take advantage of clone size information coupled with sequencing from both ends
of each template. It enforces the constraint that sequence fragments from two ends
of the same template point toward one another in the contig and are located within
a certain range of base pairs (definable for each clone based on the known clone
size range for a given library).
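
The overlap-detection step described above (a hash table of 12 bp
oligonucleotides used to list potential fragment overlaps) can be sketched as
follows. This is a minimal illustration and not the TIGR Assembler
implementation: the function names and the toy fragments are hypothetical, only
the forward strand is indexed, and the subsequent Smith-Waterman alignment and
clone-size constraints are omitted.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length used for the hash table


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=K):
    """Count distinct shared k-mers for every pair of fragments.

    Pairs that share at least one k-mer are potential overlaps to be
    confirmed later by a gapped alignment.
    """
    shared = defaultdict(int)
    for frag_ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(frag_ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(pairs):
    """Number of potential overlap partners per fragment.

    An unusually high count suggests the fragment lies in a repeat.
    """
    counts = defaultdict(int)
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


# Hypothetical toy fragments: f1 and f2 overlap by 15 bp, f3 is unrelated.
fragments = {
    "f1": "ACGTTGCAAGCTTAGGCTAACCGG",
    "f2": "GCTTAGGCTAACCGGTTAACGTCAGT",
    "f3": "TTTTCCCCAAAAGGGGTTTTCCCC",
}

pairs = candidate_overlaps(fragments)
print(pairs)
print(overlap_counts(pairs))
```

A 15 bp overlap contains four distinct 12-mers, so the pair (f1, f2) is
reported with a count of four while f3 has no candidate partners.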

The process resulted in 391 contigs as represented by SEQ ID NOs: 1-391. 

3. Identifying Genes

The predicted coding regions of the Streptococcus pneumoniae genome
were initially defined with the program GeneMark, which finds ORFs using a
probabilistic classification technique. The predicted coding region sequences were
used in searches against a database of all nucleotide sequences from GenBank
(October, 1997), using the BLASTN search method to identify overlaps of 50 or
more nucleotides with at least a 95% identity. Those ORFs with nucleotide
sequence matches are shown in Table 1. The ORFs without such matches were
translated to protein sequences and compared to a non-redundant database of
known proteins generated by combining the Swiss-prot, PIR and GenPept
databases. ORFs that matched a database protein with BLASTP probability less
than or equal to 0.01 are shown in Table 2. The table also lists assigned functions
based on the closest match in the databases. ORFs that did not match protein or
nucleotide sequences in the databases at these levels are shown in Table 3.
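
The sorting of ORFs into Tables 1-3 follows a simple decision rule, which can be
sketched as below. The record type and function name are hypothetical, and the
search results are assumed to have been summarized beforehand into the best
BLASTN overlap and identity and the best BLASTP probability for each ORF; the
sketch does not run GeneMark or BLAST.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfHits:
    """Best database hits for one predicted ORF (hypothetical summary record)."""
    orf_id: str
    blastn_overlap_nt: int = 0            # length of best nucleotide overlap
    blastn_identity_pct: float = 0.0      # percent identity over that overlap
    blastp_probability: Optional[float] = None  # None if no protein hit


def assign_table(orf: OrfHits) -> int:
    """Return 1, 2 or 3 according to the thresholds given in the text."""
    # Table 1: nucleotide match of 50 or more nt with at least 95% identity.
    if orf.blastn_overlap_nt >= 50 and orf.blastn_identity_pct >= 95.0:
        return 1
    # Table 2: protein match with BLASTP probability <= 0.01.
    if orf.blastp_probability is not None and orf.blastp_probability <= 0.01:
        return 2
    # Table 3: no match at these levels.
    return 3


# Hypothetical examples, one per table.
examples = [
    OrfHits("orf_a", blastn_overlap_nt=120, blastn_identity_pct=98.5),
    OrfHits("orf_b", blastn_overlap_nt=40, blastn_identity_pct=99.0,
            blastp_probability=1e-20),
    OrfHits("orf_c", blastp_probability=0.5),
]
tables = {orf.orf_id: assign_table(orf) for orf in examples}
print(tables)
```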



WO 98/18931    PCT/US97/19588



ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Streptococcus pneumoniae 
Protein 

Substantially pure protein or polypeptide is isolated from the transfected or
transformed cells using any one of the methods known in the art. The protein can
also be produced in a recombinant prokaryotic expression system, such as E. coli,
or can be chemically synthesized. Concentration of protein in the final preparation
is adjusted, for example, by concentration on an Amicon filter device, to the level
of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can
then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and
isolated as described can be prepared from murine hybridomas according to the
classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated
with a few micrograms of the selected protein over a period of a few weeks. The
mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma
cells, and the excess unfused cells destroyed by growth of the system on selective
media comprising aminopterin (HAT media). The successfully fused cells are
diluted and aliquots of the dilution placed in wells of a microtiter plate where
growth of the culture is continued. Antibody-producing clones are identified by
detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth.
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones
can be expanded and their monoclonal antibody product harvested for use. Detailed
procedures for monoclonal antibody production are described in Davis, L. et al.,
Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a
single protein can be prepared by immunizing suitable animals with the expressed
protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many
factors related both to the antigen and the host species. For example, small
molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations
and dose, with both inadequate and excessive doses of antigen resulting in low titer
antisera. Small doses (ng level) of antigen administered at multiple intradermal
sites appear to be most reliable. An effective immunization protocol for rabbits
can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991
(1971).

Booster injections can be given at regular intervals, and antiserum harvested
when antibody titer thereof, as determined semi-quantitatively, for example, by
double immunodiffusion in agar against known concentrations of the antigen,
begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of
Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM).
Affinity of the antisera for the antigen is determined by preparing competitive
binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of
Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For
Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in
quantitative immunoassays which determine concentrations of antigen-bearing
substances in biological samples; they are also used semi-quantitatively or
qualitatively to identify the presence of antigen in a biological sample. In addition,
antibodies are useful in various animal models of pneumococcal disease as a means
of evaluating the protein used to make the antibody as a potential vaccine target or
as a means of evaluating the antibody as a potential immunotherapeutic or
immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA

Various fragments of the Streptococcus pneumoniae genome, such as those 
of Tables 1-3 and SEQ ID NOS: 1-391, can be used, in accordance with the present 
invention, to prepare PCR primers for a variety of uses. The PCR primers are 
preferably at least 15 bases, and more preferably at least 18 bases in length. When 
selecting a primer sequence, it is preferred that the primer pairs have approximately 
the same G/C ratio, so that melting temperatures are approximately the same. The 
PCR primers and amplified DNA of this Example find use in the Examples that 
follow. 
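
By way of illustration only, the primer criteria above (minimum length, matched G/C ratio, similar melting temperature) can be sketched in a few lines of Python. The sketch uses the common rule-of-thumb estimate Tm = 2(A+T) + 4(G+C) for short oligonucleotides, which is an assumption of this example and not a method stated in this specification. The 4 degree tolerance, the function names and the two 18-mers are likewise hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def pair_is_balanced(fwd: str, rev: str, min_len: int = 15,
                     max_tm_gap: int = 4) -> bool:
    """True if both primers meet the length floor and their Tm estimates are close.

    min_len follows the 15-base preference in the text; max_tm_gap is an
    assumed tolerance chosen for this example.
    """
    long_enough = len(fwd) >= min_len and len(rev) >= min_len
    return long_enough and abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_gap


# Made-up 18-mers used only to exercise the functions.
forward = "ACGTACGTACGTACGTAC"
reverse = "TTGACCGATAGCTTGCAA"

print(gc_fraction(forward), wallace_tm(forward))
print(gc_fraction(reverse), wallace_tm(reverse))
print(pair_is_balanced(forward, reverse))
```

The rule-of-thumb estimate is only a coarse screen; a nearest-neighbor calculation would normally be preferred when primer pairs must be matched closely.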

5. Gene Expression from DNA Sequences Corresponding to ORFs 

A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3 
is introduced into an expression vector using conventional technology. 
Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well 
known in the art. Vectors and expression systems are commercially 
available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If 
desired, to enhance expression and facilitate proper protein folding, the codon 
context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, 
incorporated herein by this reference. 
The following is provided as one exemplary method to generate 
polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome 
fragment. Bacterial ORFs generally lack a poly A addition signal. The addition 
signal sequence can be added to the construct by, for example, splicing out the poly 
A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction 
endonuclease enzymes and incorporating it into the mammalian expression vector 
pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the 
LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The 
positions of the LTRs in the construct allow efficient stable transfection. The 
vector includes the Herpes Simplex thymidine kinase promoter and the selectable 
neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the 
bacterial vector using oligonucleotide primers complementary to the Streptococcus 
pneumoniae DNA and containing restriction endonuclease sequences for PstI 
incorporated into the 5' primer and BglII at the 5' end of the corresponding 
Streptococcus pneumoniae DNA 3' primer, taking care to ensure that the 
Streptococcus pneumoniae DNA is positioned such that it is followed by the poly 
A addition sequence. The purified fragment obtained from the resulting PCR 
reaction is digested with PstI, blunt ended with an exonuclease, digested with 
BglII, purified and ligated to pXT1, now containing a poly A addition sequence 
and digested with BglII. 
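
The primer construction described above can be sketched as follows, purely as an illustration. The recognition sequences for PstI (CTGCAG) and BglII (AGATCT) are the standard ones. The function names, the 18-base annealing length and the toy insert are assumptions of this example and are not taken from the specification.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
BGLII_SITE = "AGATCT"  # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(insert: str, anneal_len: int = 18) -> tuple[str, str]:
    """Return (5' primer, 3' primer), each written 5'->3'.

    The 5' primer is the PstI site followed by the first anneal_len bases
    of the insert. The 3' primer is the BglII site followed by the reverse
    complement of the last anneal_len bases, so that the site sits at the
    5' end of the 3' primer as described in the text.
    """
    insert = insert.upper()
    if len(insert) < anneal_len:
        raise ValueError("insert shorter than the annealing length")
    five_prime = PSTI_SITE + insert[:anneal_len]
    three_prime = BGLII_SITE + reverse_complement(insert[-anneal_len:])
    return five_prime, three_prime


# Toy insert used only to show the shape of the output.
fwd, rev = tailed_primers("ATGAAACCCGGGTTTTAA", anneal_len=6)
print(fwd)
print(rev)
```

A fuller version would presumably also confirm that neither recognition sequence occurs inside the insert itself, since an internal site would be cut during the digestion steps.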

The ligated product is transfected into mouse NIH 3T3 cells using 
Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions 
outlined in the product specification. Positive transfectants are selected after 
growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). 
The protein is preferably released into the supernatant. However, if the protein has 
membrane binding domains, the protein may additionally be retained within the cell 
or expression may be restricted to the cell surface. Since it may be necessary to 
purify and locate the transfected product, synthetic 15-mer peptides synthesized 
from the predicted Streptococcus pneumoniae DNA sequence are injected into mice 
to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae 
DNA. 
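
As a rough illustration of how candidate 15-mer peptides might be derived from a predicted DNA sequence, the following Python sketch translates an open reading frame with the standard genetic code and lists 15-residue windows. The function names, the 5-residue step between windows and the demonstration sequence are assumptions of this example; the specification does not state how the peptides are chosen.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate frame 1 up to the first stop codon or the last complete codon.

    Only A, C, G and T are handled; any other character raises KeyError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def peptide_windows(protein: str, size: int = 15, step: int = 5) -> list[str]:
    """List peptides of the given size, starting every `step` residues."""
    return [protein[i:i + size] for i in range(0, len(protein) - size + 1, step)]


# Made-up coding sequence used only to exercise the functions.
demo = "ATGGCTAAAGAACTTGGTATCCGTAAAGCTGAAGTTGACCTTCAAGGTAAAGCTTAA"
for peptide in peptide_windows(translate(demo)):
    print(peptide)
```

In practice the windows would presumably be ranked further, for example by predicted surface exposure, before any peptide is synthesized; that step is outside this sketch.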
Alternatively, and if antibody production is not possible, the Streptococcus 
pneumoniae DNA sequence is additionally incorporated into eukaryotic expression 
vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease 
cleavage sites are engineered between the globin moiety and the polypeptide 
encoded by the Streptococcus pneumoniae DNA so that the latter may be freed 
from the former by simple protease digestion. One useful expression vector for 
generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit 
globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases 
the level of expression. These techniques are well known to those skilled in the art 
of molecular biology. Standard methods are published in methods texts such as 
Davis et al., cited elsewhere herein, and many of the methods are available from the 
technical assistance representatives from Stratagene, Life Technologies, Inc., or 
Promega. Polypeptides of the invention also may be produced using in vitro 
translation systems such as the In vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes 
of clarity and understanding, one skilled in the art will appreciate that various 
changes in form and detail can be made without departing from the true scope of 
the invention. 

All patents, patent applications and publications referred to above are 
hereby incorporated by reference. 
(i) APPLICANT: Charles Kunsch 
Patrick S. Dillon 
(ii) TITLE OF INVENTION: Streptococcus pneumoniae Polynucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 391 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Human Genome Sciences, Inc. 
(B) STREET: 9410 Key West Avenue 
(C) CITY: Rockville 
(D) STATE: Maryland 
(E) COUNTRY: USA 
(F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 
(B) COMPUTER: HP Vectra 486/33 
(C) OPERATING SYSTEM: MSDOS version 6.2 
(D) SOFTWARE: ASCII Text 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5625 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
CCAAGCAAAA CCAGCTACAG CTAAAGGAAC TTACGTAACA AACTTGACTA TCACAACTAC 60 
TCAAGGTGTT GGTATCAAAG TTGACGTAAA CTCACTTTAA TCAGTAGTTA AAGTAATGTA 120 
AAAAAGTTGA AGACGCTATG TCTCAACTTT TTTTGATGTA CGACGGGCAT GTTGTATAGT 180 
AGATGTGTAC TATTCTAGTT TCAATCTACT ATAGTAGCTC AGAAGTCGGT ACTTAAACGT 240 
GCTATATCAA AACCAGTCCT TGAAAAACGT GGACTGGTTT CGTGTTTGGA TTATTACCTT 300 
GAACGACATG CGTTAAAAGT TAGTTGAACC GCCGTATGCC GAACGGACGT ACGGTGGTGT 360 
GAGAGGGGCT AGAGATTATC CCCTACTCGA TTTCGAAATC TAGTGGAATG AATCTGGAAT 420 
AGTCCATCGA GCTTTCTAAT ACTCTTCGAA AATCTCTTCA AACCACGTCA ACGTCGCCTT 480 
GCCGTGCGTA TGGTTACTGA CTTCGTCAGT TCTATCCACA ACCTCAAAAC AGTGTTTTGA 540 
GCTGACTACG TCAGTTCCAT CTACAACCTC AAAACAGTGT TTTGAGCAAC CTGCGGCTAG 600 
TTTCCTAGTT TGCTCTTTGG TTTTCATTGA GTATAACACA TTGTTAGAAG TTGGTTTAAA 660 
TTTCCTAATC AGTTTGTTCA CATTTACCTT CGATATATTA TATCCCATAG TTAAGGTTGG 720 
TCATACAGAT GATTATAGTC ATGGAGCCGT AAAACTTAGT GTTTCTTTAG TTGACAAAGA 780 
TGCCATGAAA AAAATATTTG TAACTGTAAT AGGATATTTT GAAATAAATA TAGATGAAAA 840 
TATCACCGAT ATTCTATACG TAAATGGTAC TGCTATTCTT TATCTTTATT TACGTTCAAT 900 
TGTTTCAATA GTTTCGGCAA TTGATAGCAG TGAAGCAATG TTGCTACCTA TCATTAATGT 960 
TTTAGAGTTA CTAGATAAAT CTCAACCTTT TGAAGAAGAA TAATTTATTA GCTCACTAAA 1020 
TTGAGGGTAA GGAAAAGTAA AAGCAGTAAG AAAAATGTCT TGCATTATAC AGCAACCTTT 1080 
TGGGAATGAG TGGATGGATT GAATAAAATT TGATTAAGAG TGGATGATTT ATCTGTAGAT 1140 
TATTATTGGA CAGTTAGTCT TGAAGTAGTC TAAGAATTAG GTTATAATCA GTAGAAGCCT 1200 
TGCTAATAAT GAGGAGGTTA GTTTATGTAT AGTAGACTGA ATCTAAAATA GTACGAAACA 1260 
ATTGCTAAAA CATTTATAGA AATTAATTTT ACTTTCCCAA TCGATTTGTT CTCATCTTAT 1320 
TTCAATCCGC TATATATTAT GGTATCGAAT CTTCATCAGA ATGATAAAAT TAATCAATTG 1380 
ATATCTGATT ACAAACAGAA TATGAAAGCT TTTTATATCA CTATTGAAAA ATTTATACGA 1440 
GATGATGAAA GCCTTAAGTG TTATTTTATA AAGGTTATTT CAAGTCGTTC CAAGGTAACA 1500 
AGTCTAGATC AGATTGAAGC TGATAAAACG ATACAAAGAA AATATTCAAG TGAGCTAAAA 1560 
AAATTTATTG GATTTTATAA TGAGATTATT TGTGAGGAAA ATAGTTTCCT ACATGTACGA 1620 
AAGAGGTGGT CGAGTTGGTT TAGGTAGTCG ATGCGTGAGT TGATAATTCT CAGGGTATGG 1680 
ACTTCTTTTT CATGAATGAG GTAAAAGAGC AGGTATTGTT TAGAGACAAT CATTCTGAGC 1740 
ATATTTTCTG GATAGAGGGA GTATCCGATT TTATGATCAA AGTTAATACC GCCCTCTGGT 1800 
GAGAAGATGA GTAGGTTGGT AATTTAAACT ATTAAACAGA ATTTTTGATT AAAAGTATTA 1860 
TTTCATGAGA GAAATCCTAA TTTCACAATC CATAGGCAAA CGCTTGCATT TCGTTTTTTA 1920 
TTGGACTATA ATAGGTTGGT ATAAAGCCTT CTGTAGTAAT AAAATGTAGA AGGTGTAGAA 1980 
AGTAAGGATT TAGAATATTT GTAGTTAAAA ACACAATGTT GCTATTCCTT ACGATAGGGA 2040 
GATAGATATG GCAATGATAG AAGTGGAACA TCTTCAGAAA AATTTTGTGA AGACTGTTAA 2100 
GGAACCGGGC TTGAAGGGGG CTTTGCGCTC CTTTATTCAT CCTGAAAAGC AGACCTTTGA 2160 
AGCGGTCAAG GATTTGACCT TTGAGGTTCC AAAAGGGCAG ATTTTAGGAT TTATCGGGGC 2220 
AAATGGTGCT GGGAAGTCGA CAACCATTAA AATGCTGACA GGAATTTTGA AACCAACATC 2280 
TGGTTTTTGT CGGATTAACG GCAAGATTCC CCAGGACAAT CGGCAAGATT ATGTCAAAGA 2340 
TATTGGCGTA GTCTTTGGAC AACGCACCCA GCTATGGTGG GATTTGGCTC TGCAAGAGAC 2400 
CTACACTGTC TTAAAAGAGA TTTATGATGT GCCAGACTCG CTCTTTCATA AGCGTATGGA 2460 
CTTTTTGAAT GAAGTCTTGG ATTTGAAGGA CTTTATCAAG GATCCCGTGC GGACTCTTTC 2520 
ACTGGGACAA CGGATGCGGG CGGATATTGC GGCCTCCTTG CTCCACAATC CCAAGGTTCT 2580 
TTTTTTAGAT GAGCCGACCA TTGGTTTGGA CGTTTCGGTT AAGGATAATA TTCGTCGGGC 2640 
AATTACTCAG ATCAATCAAG AGGAAGAAAC TACCATTCTT TTGACCACTC ACGATTTGAG 2700 
TGATATTGAG CAACTTTGTG ATCGGATTTT CATGATTGAC AAGGGGCAAG AGATTTTTGA 2760 
TGGAACGGTG AGCCAACTCA AGGAGACCTT TGGTAAGATG AAGACTCTCT CTTTTGAACT 2820 
GCTACCAGGT CAAAGTCATC TCGTCTCTCA CTATGACGGT CTGTCTGATA TGACCATTGA 2880 
TAGACAAGGA AACAGCCTCA ACATTGAATT TGATAGTTCT CGCTACCAGT CAGCTGACAT 2940 
TATCAAGCAA ACCCTGTCTG ATTTTGAAAT CCGCGATTTG AAGATGGTGG ATACGGATAT 3000 
TGAGGATATT ATCCGTCGCT TCTACCGAAA GGAGCTCTAG GATGATCAAA TTGTGGAGAC 3060 
GTTATAAACC CTTTATCAAT GCAGGGGTTC AGGAGTTGAT TACTTACCGA GTCAACTTTA 3120 
TTCTCTATCG GATTGGCGAT GTCATGGGGG CTTTTGTGGC CTTTTATCTC TGGAAGGCTG 3180 
TCTTTGATTC TTCGCAAGAG TCTTTGATTC AGGGCTTCAG TATGGCGGAT ATCACCCTCT 3240 
ACATCATCAT GAGTTTTGTG ACCAATCTTC TGACTAGATC CGATTCGTCC TTTATGATTG 3300 
GGGAGGAGGT CAAGGATGGC TCCATTATCA TGCGTTTGTT GCGACCAGTG CATTTTGCGG 3360 
CCTCCTATCT TTTCACCGAG CTTGGTTCCA AGTGGTTGAT TTTTATCAGC GTTGGCCTTC 3420 
CATTTTTAAG TGTCATTGTC TTGATGAAAA TCATATCGGG TCAAGGTATT GTAGAGGTGC 3480 
TAGGATTAAC TGTCATTTAT CTTTTTAGCT TAACGCTCGC CTATCTGATT AACTTTTTCT 3540 
TTAATATTTG CTTTGGATTT TCAGCCTTTG TGTTTAAAAA TCTTTGGGGT TCCAACCTAC 3600 
TTAAGACTTC CATAGTGGCT TTTATGTCGG GGAGTTTGAT TCCCTTGGCA TTTTTTCCAA 3660 
AGGTTGTTTC AGATATTCTC TCCTTTTTGC CTTTTTCATC CTTGATTTAT ACTCCAGTTA 3720 
TGATCATTGT TGGAAAATAC GATGCCAGTC AGATTCTTCA GGCACTCCTT TTGCAGTTCT 3780 
TCTGGCTCTT AGTGATGGTG GGATTGTCTC AGTTAATTTG GAAACGGGTC CAGTCCTTTA 3840 
TCACCATTCA AGGAGGTTAG TATGAAAAAA TATCAACGAA TGCATCTGAT TTTTATCAGA 3900 
CAATACATCA AACAAATCAT GGAATATAAG GTAGATTTTG TGGTTGGTGT CTTGGGAGTC 3960 
TTTCTGACTC AAGGCTTGAA TCTCTTGTTT CTCAATGTCA TCTTTCAACA TATTCCATTC 4020 
CTAGAAGGCT GGACCTTTCA AGAGATAGCT TTCATTTATG GATTTTCCTT GATTCCCAAG 4080 
GGAATGGACC ATCTCTTTTT TGACAATCTC TGGGCACTAG GGCAACGCCT AGTCCGAAAA 4140 
GGGGAGTTTG ACAAGTATCT GACTCGTCCC ATCAATCCTC TCTTTCACAT CCTAGTTGAA 4200 
ACCTTTCAGA TTGATGCCTT GGGTGAACTC TTAGTCGGTG GTATTTTATT GGGAACAACA 4260 
GTGACCAGCA TTGTTTGGAC TCTTCCAAAA TTCCTGCTTT TCCTAGTTTG TATTCCTTTT 4320 
GCGACCTTGA TTTATACTTC TCTTAAAATC GCAACAGCCA GTATCGCCTT TTGGACTAAG 4380 
CAGTCAGGCG CCATGATTTA CATCTTCTAT ATGTTCAATG ACTTTGCTAA GTATCCGATT 4440 
TCTATTTACA ATTCTCTTCT TCGTTGGTTG ATTAGCTTTA TCGTGCCTTT CGCCTTTACA 4500 
GCCTACTATC CAGCTAGCTA TTTCTTACAG GAAAAGGATG TGTTCTTTAA CGTAGGAGGT 4560 
TTGATGTTGA TTTCTCTGGT TTTCTTTGTT ATTTCCCTTA AACTTTGGGA TAAGGGCTTA 4620 
GATTCCTACG AAAGTGCGGG TTCGTAAAAG CTAAAGTAAG ACTAAAATCA AGAAAGAAAC 4680 
TTATGATGTT TGTAATTGAA GAAGTCAAGG ATGAAAATCA AAAAAAGGCA GTTGTCGCTG 4740 
AGGTTTTGAA GGATTTGCCA GAATGGTTTG GAATCCCAGA AAGCACACAA GCCTATATAG 4800 
AAGGAACCAC GACACTGCAA GTTTGGACCG CCTATCAGGA GAGTGATTTG ACTAGATTTG 4860 
TAAGCTTATC CTATTCGAGT GAAGATTGTG CAGAGATTGA TTGTCTCGGC GTAAAAAAGC 4920 
TTATCAAGGT AGAAAAATTG GGAGCCAATT GCTTGCTACT TTAGAGAGTG AAGCTCGTAA 4980 
AAAAGTTGGT TATCTGCAGG TCAAAACAGT GGCAGAAGGT TCTAATAAAG ATTATGATCG 5040 
AACAAATGAC TTTTATCGAG GTCTTGGCTT TAAAAAGTTA GAGATTTTTC CTCAACTATG 5100 
GAATCCGCAA AATCCTTGTC AGATTTTGAT TAAAAAGCTT GAATAATATT ACTTGACATC 5160 
TATTCTCAGA GTGCTATACT GTAAGTGTAA TCGCCGATTT AGCTTAGTTG GTAGAGCAAG 5220 
GCACTCGTAA AGCCTAGGTT ATAGGTAGAT AAACGACTGA GGATTTGAAA AAATAGATAG 5280 
GTAGAAGATA ACCGTTAAGC CTTACTCTTA GCGGTTATTT ATATTGTTTA ATAGCGCTAA 5340 
TATTTTATCA ATTATGCCTG TTTTCGTGTT TCTGGTAGTT GTTCAAGTTT ATTGCTACTA 5400 
TTTTTGATGG TATGAATGTG CTTATAATGT ATCCCGGTTA ACGAAAGTTT TGGACTTATA 5460 
CTCTTCGAAA ATCTCTTCAA ACCACGTCAA CGTCGCCTTG CCGTGCGTAT GGTTATGACT 5520 
TCGTCAGTTC TATCCACAAC CTCAAAACAG TGTTTTGAGT GACTACGTCA GTTCCATCTA 5580 
CAACCTCAAA ACACTGTTTT GCCCAATCTG CGGCTAGTTT CCTAG 5625 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7571 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CTCTCCAGCT TTCCTTGCGA GTTGGCCATG TTGTGTCTTT AAGAAGTCTA AAAATATCTC 60 
CAATAAAACG CATCGCTCTC TCCTATCTCG TTTCTCTGTG TGTAGTGTAC TTGCCACAAT 120 
GCTTACAAAA TTTATTTACT TCTAGTCGTG TAGGCTTGAG GTTTCCGCTG ATCTTGATTG 180 
AATAGTTTCT CGAACCACAA ACCGCACAAG CTAGGCTTGC TTTTTTTAGT GCCATAACGC 240 
CTCCATCTTA TCCATTATAA CAAGAAAGCT AGGCTTTGAC AAGCATCTTA GCGAAATAGA 300 
TTGACTATCG AATCCCATAT TGTTTGAGCC TTTTCCTTAA TCTTCGCATC TGAGATAGCC 360 
CGGCTAGCCT CATCTACTAG ACTTTGCGCA CGCCCTCGAA TATCAGACAA ATTATCATCT 420 
GTCTGGCTAT TATCATTGGT TTGTACTTGT CTTTTTGTAT TGGCTGGTGC AATTCCATTT 480 
TGCTTATAAG CATTTTCAAC CGTAAAGGTA CTTCCTGGCG TATAAGGTAA AATGGTATTG 540 
GCAATGTTTC TAAAGACATG AGCTGCACCG TTTGAAGTAG AGCCAGCTAG ATAGTGGTTT 600 
TCATCAGTGG TCGGAAAGCC AAGCCAGTGG CTAATCACTA CATCCGGAGT ATAACCAATT 660 
ACCCACTGGT CACTTGTGTA CTCCGGATTG AAAACTGCTT CAGTTGTTCC AGTTTTCCCT 720 
GCCATGACAT AGTCTGCAGG CGATGAACTA ATACCGGTAC CGTTGGTGAA AGTCCCCAAC 780 
ATCATACTGG TCATCTTGTC AGCTACAGAC TTATCAATCA CCCGTTTTTG TGAATTTTTA 840 
TGACTCGCAA TAACTTGTCC ACTAGCATTT TCAATTCTAC TAATAAAATG AGCTTCAGGC 900 
ATTAAACCTT CATTTGCAAA GGCGGCGTAT GCTTGAGCCA TTTGAAGAGG GTTGGTTTCA 960 
ACACCGCTTC CCAAGGCGAC ACCAAGAACA CGGTCGACCT TTTCCATGTT GAGTCCGAAT 1020 
TTTTCGCCTG CCTCAAAAGC CTTGTCGACA CCCAAATCAT TAACAGTGGC AACAGCAGGT 1080 
AGATTAAGCG ATTCTGCCAA GGCTTGATAC ATAGGAACTT CTCGACTCGT TTTGATCCCT 1140 
GCATAGTTAT CAACCTTATA GCTGTCATAC TGCATGGTAT GGTTATCCAA CTGCTTATTC 1200 
AAAGCCCAGC TTGCTTCAAC TGCTGGCGTA TAAACAACTA AAGGCTTAAT TGTAGAACCA 1260 
GGACTACGCT TTGATTGGGT TGCATAGTTG AAATTCCGGA ATCCAGTTTT ATCATTGTCA 1320 
GCAACTTGAC CGACAACTCC ACGAACTCCC CCTGTTTTCG GTTCGAGGGC TACACTTCCT 1380 
GATTGAGCAA ACGTTCCATC CTCTGCCCTC GGAAATAGCG ATGTGTTTTC ATAAACAATC 1440 
TGCATATTTG CTTGGTAGTT TTGGTCCAGC TCTGTGTAAA TGCGGTAGCC ATTATTGACA 1500 
ATCTCTTCCT CTGTTAGATT ATACTTGGAA ACAGCTTCAT TAACCACCGC ATCAAAATAA 1560 
GAGGGGTAAC GGTAATCTGA GATTTTTCCT TCATACTTAT CGTGCAATTG CGAAGTCATA 1620 
TCAACTTCAG CAGCTTTGGT TTCTTGGTTT TTATCAATAT ATCCTGCTGC AACCATATTC 1680 
TGCAAGACAG TATCGCGCCG ATTAGTAGAA TCTTCTACGG AATTCAAGGG ATTATACAGT 1740 
TCCGGCCCCT TGAGCATCCC TGCCAGAGTC GCAGCTTGAT CCAGACTCAC TTCTGATGCA 1800 
GAAACTCCAA AGTATTTCTT ACTCGCATCT TCTACACCCC ACACACCATT TCCAAAATAA 1860 
GCGTTGTTAA GGTACATGGT TAGAATTTGC TCCTTACTAT ATTTTTTGCT TAATTCTAAG 1920 
GCAAGGAAAA ATTCTTTCGC TTTTCTCTCA ACAGTTTGAT CCTGCGATAA ATAGGCGTTT 1980 
TTAGCCAGCT GTTGGGTAAT GGTAGAGCCA CCACCTGAAC GTCCAGCAGT GACAATAGCC 2040 
AAGAAAAAAC GGCCATAGTT AATCCCGTCA TTTTTATAGA AAGAACGGTC TTCTGTCGCA 2100 
ATAACAGCAT TCTGCAAGTT TTTACTGATG TCAGTCAGCT CAACATAGGT TCCCTTTTGA 2160 
CCAGACAAGG CACCAGCCTC TTTTTCTTCA CGGTCAAAAA TAAGAGTCCG AGTTTTCAAG 2220 
GCATTTTGCA AATCATTGAC ATTGGTCGAC TTGGCTACAG CAAACAAATA GATTCCAACT 2280 
AGCAAGCCTG CACTCAAACC TAGTATAAGG ATAATCTTTG TTAGATGATA ACGACGCCAG 2340 
AATTTTCGAA TCGGACCTAC TTGGGCTAAT TTTTTTCGAT CACTACGAGA GCGACGTAAG 2400 
ATAGTAGAAT CAGAGTCCTC TAGTTCACTT GTTTCTTTTT TAAAAAGAGA AAGAAATTTC 2460 
TCAAATAATT TATCTAATTT CATGCGTTTA TTTTATCATC TTCATCATAG GAAGACAAGA 2520 
ATTTAGCTAT TTCCTATCCA AATAGGGCTT TTTTTGTTAC AATATCTGTA TGCAATTCAC 2580 
ATTTACATTA CCCGCCTCTC TACCTCAAAT GACAGTAAAG CAATTACTTG AGGAACAACT 2640 
CCTCATCCCT AGAAAAATCC GTCATTTTTT GAGAATCAAG AAACATATTT TGATAAATCA 2700 
AGAAGAAGTC CACTGGAAGG AAATCGTAAA TCCTGGAGAT GTTTGCCAGT TGACTTTTGA 2760 
CGAGGAAGAT TATTCCCAAA AGACGATCCC TTGGGGCAAC CCAGACTTAG TGCAGGAAGT 2820 
TTATCAAGAT CAACACTTGA TTATTGTAAA CAAACCAGAG GGGATGAAAA CGCATGGTAA 2880 
TCAACCAAAC GAAATTGCCC TTCTTAACCA TGTCAGTACC TATGTTGGCC AAACCTGCTA 2940 
TGTCGTTCAT CGTCTGGACA TGGAAACCAG TGGCTTAGTT CTCTTTGCCA AAAATCCTTT 3000 
TATCCTGCCC ATTCTCAATC GCTTATTGGA GAAAAAAGAG ATTTCTAGAG AATATTGGGC 3060 
TCTAGTTGAT GGAAATATCA ACAGAAAAGA ACTTGTTTTC AGAGACAAAA TTGGACGTGA 3120 
TCGCCATGAT CGTAGAAAAA GAATAGTTGA TGCAAAAAAT GGGCAATATG CTGAAACGCA 3180 
TGTAAGCAGA TTAAAGCAAT TCTCAAACAA GACTTCCTTG GCTCATTGCA AGCTAAAGAC 3240 
AGGGCGAACC CATCAGATTC GTGTGCACCT TTCGCATCAT AATCTTCCTA TCCTGGGAGA 3300 
CCCTCTCTAT AATAGTAAAT CAAAGACAAG CCGGCTTATG CTTCATGCCT TCCGACTTTC 3360 
CTTTACCCAC CCACTTACTT TAGAGAAGCT AACTTTCACT ACCCTTTCAA ATACATTTGA 3420 
AAAAGAATTA AAAAAGAATG GATGATCGTG TCATCCATTT TTCCATATAA AAAAGCAAGA 3480 
CCACAAAGCC TTGCTTTCTA TCAACTCAAG AATTATTTAG CAATTTTTGC GAAGTATTCA 3540 
AGAGTACGAA CAAGTTGTGC AGTGTATGAC ATTTCGTTGT CGTACCATGA TACAACTTTA 3600 
ACCAATTGTT TACCGTCAAC GTCAAGAACT TTAGTTTGAG TTGCGTCAAA CAATGAACCG 3660 
TAAGACATAC CTACGATATC TGAAGATACG ATTGGATCTT CTGTGTAACC GTATGATTCG 3720 
TTTGAAGCTG CTTTCATAGC TGCGTTCACT TCATCAACAG TAACGTTCTT TTCAAGAACT 3780 
GCTACCAATT CAGTAACTGA TCCAGTTGGA GTTGGAACGC GTTGTGCAGA TCCGTCAAGT 3840 
TTACCATTCA ATTCTGGGAT TACAAGACCG ATAGCTTTTG CAGCACCAGT TGAGTTAGGA 3900 
ACGATGTTTG CAGCACCAGC GCGAGCACGG CGAAGGTCAC CACCACGGTG TGGTCCGTCA 3960 
AGGATCATTT GGTCACCAGT GTAAGCGTGG ATAGTAGTCA TCAATCCTTC AACAACACCA 4020 
AAGTTGTCTT GAAGAGCTTT AGCCATTGGA GCCAAGCAGT TTGTAGTACA TGAAGCACCT 4080 
GAGATAACTG TTTCAGTACC GTCAAGAACG TCGTGGTTAG TGTTGAATAC AACTGTTTTA 4140 
ACGTCGTTTC CACCAGGAGC AGTGATAACA ACTTTTTTAG CTCCACCTTT AAGGTGTTTT 4200 
TCAGCTGCTT CTTTCTTAGC AAAGAAACCA GTAGCTTCAA GAACGATTTC TACACCGTCA 4260 
GTAGCCCAGT CGATTTGTTC TGGATCACGT TCAGCAGAAA CTTTGATGAA TTTACCGTTA 4320 
ACTTCAAATC CACCTTCTTT AACTTCAACA GTACCGTCGA AACGACCTTG AGTTGTGTCG 4380 
TATTTCAACA AGTGTGCAAG CATAACTGGA TCTGTAAGGT CGTTGATGCG TGTAACTTCA 4440 
ACACCTTCTA CGTTTTGGAT ACGACGGAAA GCAAGACGAC CGATACGTCC GAAACCGTTA 4500 
ATACCAACTT TAACTACCAT TAGTGATTTC CTCCTTATGA AAATCATGAA ATTTTTATTG 4560 
TGAAAAGAGT AACTTGAATC ACTACAAATC ACCTTTCAAC AAACCTATTA TACAACTATT 4620 
TGAGTTGAAT TGCAAGTATG GCCATTGTTT TTCTATGTTA GTTTCTTTTT AAGACTGTAA 4680 
ACCAAGGAAT CCCTTACTAT TCATAGCATA ACGATTCTAT AGGATCCATT TTACTAATCT 4740 
TACGCGCCGG GAAGTAGGCT GAGACATAAC CAAGTAATAG AGCGAAAACT AGAGTTCCTA 4800 
AAACAGATAA AAGATTTAAT TTAAAAACCT TAGTGATGGA TGGGTAAAAG TGACTTACAA 4860 
TCGCATTCGC CAAACTTCCC ACCCCTTGTG CAACCAAAAA TGCCAGCAGC AAGGCGATGC 4920 
CTACAATCCA GATAGCCTCG TAAATAAAAA TTCCTTTGAC ATCACGATTC TGATAACCAA 4980 
CTGCTTTCAT GACACCTATT TCCTTGGAAC GTTGCATGAT ATTGATGTAA ATAATGATAC 5040 
CAATCATAAC CGCTGCTACC ACAATAGCTT GTGATGAAAG CACAATCAAT AATCCCTGAA 5100 
TAACACGAAT AAAGGTAATC ACAATATCAA GAACTCTCTG TTGAGAAAGC ACAGTATACT 5160 
TCTTATTTTT CTGTAATTCT TCTGTTACTA CTTTTGTCTG TGATGGATCT TTGAGTTCCA 5220 
AGATAAAATA AGATACAGCT TTCGTAAATC CAGCCTCTTT CAAAATCGTT TCCATTTGAT 5280 
GAGACAGCAT GAAACTGTTG CTGTCCTCCA TGTCATCTTC ATCATTGATT ACACGTACAA 5340 
TCTTCGTTTG AAATTGAGCA ATCTTACTAG TTTCGGCAGC ACTTTCTACA ATGCTGGCTG 5400 
AGACTGATTT GCCAATAAGA TCATTAGCTG TCAAATTTTT TCCTGTCTGT TCATTCCAAT 5460 
TTTTTAGTAA ACTGCTTGGA ATCGTTAATC CCTGTTCATT TGTATCAGTA TAGAGGGATC 5520 
CAGCCAACAC TTTGTCCGTC TCATTATTAC TAACAGAGAT ACTTGTATCA TCATAAAGAC 5580 
TCACTACTTG AGCATAAGAA GGCATCGTTT GACTCAGATC CATTTCTTGC CCATCTATAG 5640 
TAATATTTGA CATGTTCATC CCAAAAGGAC TCTCCAAATA TTTAATAGCT TCTTTCCCAA 5700 
CTGTATCCGT GATATATAGT CAATTGAAAC AAGAGCAGGA TAAAAAAGCC TCGTAAAAGG 5760 
TATTGCAACT TGGTAATACC TTTTTGAGGT GCTTTTTGAT ATGAGCCCAT GTTTTCTCAA 5820 
TAGGATTGTA CTCAGGCGAG TAGGGAGGAA GAGGTAAAAG TTTATGCCCA AACTCTTCGC 5880 
ATAAAAGTTC TAGCTTCCCC ATTCTATGGA ATCTTACATT ATCCATAATA ATAACCGATG 5940 
GTGTGTTTAA TGTTGGTAAG AGAAAATTCT GAAACCAAGC TTCAAAAAAG TCGCTCGTCA 6000 
TCGTCTCTTC GTAAGTCATT GGAGCGATTA ATTCACCATT TGTTAGACCT GCAACCAAAG 6060 
AAATCCTCTG ATATCTTCTT CCAGATACTT TGCCTCTTAT TAATTGACCT TTTAATGAGC 6120 
GACCATATTC TCGATAAAAA TAAGTATCGA ATCCTGTTTC GTCAATCTAA ACAGGTGCTA 6180 
GGTGCTTTAA ACTATTAAAA TTCTTAAGAA ATAAGGCTAC TTTTTCTGGG TCTTGTTCAT 6240 
AGTAGGTGTG GTTCTTTTTT CGAGTGTAGC CCATAGCTTT GAGCGTATAG TGGATGGTAG 6300 
TTGGATGACA GCCAAATTCA GAAGCTATTT CAGTCAAATA AGCGTCTGGA TTGTCAGTAA 6360 
GATAGTTTTT AAGTCTATCT CTATCAACCT TTCTTGGTTT TATTCCTTTT ACTTGGTGGT 6420 
TTAGCTCTCC TGTTTTCTCT TTTAGCTTTA ACCAGCCATA AATGGTATTA CGTGAGATTT 6480 
GGAAAACGTG TGATGCTTCT GTTATACTAC CTGTTCGCTC ACAATAAGAG AGAACTTTTT 6540 
TACGAAAATC TATTGAATAT GCCATAAAAA GATTATACCA CATTGTGTAC TATTTTTGGT 6600 
TCATTTTACT ATATTTGAAG AGGCGTTTAA ACTATCTGAC ATAAAACTCG TTCTAGAGGA 6660 
AAGACATCCT TTAAAAAGTT AGTTTATTTT ACAACTTAGA CATCAAGGTA GGTTAACCCC 6720 
TTCATGGAAA AATCAAGACT CTTAGCACTA TGGGTTAAAC TACCACTGGA GACGTAATCA 6780 
ATCGCTAAAC CACGAAAACG GCTAATAGTG GTCATATCAA TATTTCCAGA ACATTCAATC 6840 
CGAGAACGTC CTGCAATTAG GGTAATGGCC TGTTCAATCT GTTCCAATGA CATATTATCC 6900 
AACATGATAA TATCAGCACC CGCCGCCGCA GCTTCTTCGG CAGCAGCAAG GCTTTCCACT 6960 
TCCACCTCGA CCATTTTCAC AAAAGGGGCA TAGGCACGCG CTTGAGCAAT TGCCTTTTGA 7020 
ACACTACCTA CTGCCGCAAT GTGATTGTCT TTTAGCAGGA TAGCATCTGA TAAATTAAAG 7080 
CGATGATTAT AGCCACCGCC AACTCTCACG GCATATTTCT CAAAAAGACG TAAATTAGGA 7140 
GTAGTTTTTC GAGTATCAAA TACCTTAATG CAATCATCGC CTAAGGCTTC TACATAAGCA 7200 
GCTGTCATCG AAGCAATCCC TGATAAATGT TGTAAAAAAT TCAAGGCAAC GCGTTCACAT 7260 
GTTAAGAGAC TTCTCACCGA GCCTATGATT TCTAAAACCA AATCGCCACT AGTCAAACGA 7320 
TCCCCATCCT TAAATTGATG AGGATTCTGG AAGGTCACCT CGGCATCAAA TAGGGTAAAA 7380 
ACCCTTTGAA AAACGGTTAG CCCCGCTAAA ACACCAGCTT CCTTGGCAAA AAGCGACACC 7440 
TTGGCTTGGC CATGATGATC AAAAATGGCA TTGGTACTGT AATCTTCGGA ATGAACATCT 7500 
TCTCGCAAGG CTGCTTTCAA TGTATCATCT ATTTGAAAAG GGGTTAAATC AGTTGAAATG 7560 
ATTGACATCA C 7571 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26385 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TTTGCTAGTG GCTTAAATTC TTCAGGAAAA TCAGGCGTAT CTAAAAGTCG TGTCGTTTTT 60 
GTTTCATCTA TATAAAGACT TCCTGCTCCC CCTACAACTA GAAAACGTGT CTGTGTTCCA 120 
GCAAGAAGCT GATTAAATAG TTCGATTGAT TTGCTGTGGA GCGGTAGCGT ATCTGGTGTA 180 
TAAGCACCAA ACGCTGAAAT AACAGCATCA AATCCAGTAA GATCATCTTT TGTCAACTCA 240 
AATAAATCTT TTTTAATAAT AGACTCAGCT TGACTTTTGT TTTCAGAACG AACAATAGCC 300 
GTTACTTCAT GTCCTCGTTT GACTGCTTCT TCAACAATTG CTTTCCCCGC TTGTCCATTT 360 
GCTGCAATAA CTGCTAGTTT CATTTTTTAT ACCTCTCTTG TTGTAATTAT TTTAGTTACA 420 
GAAATTGTGA CACTCTTAAT AATCAATGTC AATAGTCTTG CTTAATTATT ATCAAAATAT 480 
TTCTACCAAG AAAACTAACC ATGATTCTAG TGAAAAAAAA TCTTCTTTGT CAACAAATTT 540 
ACTTTCTTGT TTTAAACATG CTATAATAAT CATAGCAAGA GATCTAAGTT GTCTGTTTTT 600 
TTAAAACGAG GTGATTATCA TGCGTAGATT CTATTCCCAT CTCCCCTACT ATCTGGTCAT 660 
ATTATTCTTT TATTGGCCAC TTTATGAGTT GTTCTTACTA GTTGTTTCTG ACCCCCTTAC 720 
ACTCAAGGGA CTCTATATAA ACAATCTTCT CTTCTTTACA CCTCTGGTAA TCTTGATTGT 780 
ATCGTTACTC TATAGCTACC GTTTCCGTTT CTCACTTTGA TGGTTAGTTG GTAACGGACT 840 
GCTCTTTTAC TTTACTATCA TAACCTTTGG TGAGTTTATA CTAATTTACT TGCTAATCTA 900 
TGAAACAGTT GCTCTGGTCG GCATGGATTC TGGTATTAGC ATCAAGCATA TTCTACAAAA 960 
AATGAAAAAC AAAAAACTTT CACAAAATCC TTGAAAAATC TCACAATCAT GCTATAATAA 1020 
TCCATAGAGA CAAGTCACTT AGTCCCTTTC TACTAGAGAG TGCGTGGTTG CTGGAAACGC 1080 
ATAGGAAGTC TAAACTGATA CTACTCTTGA GTTTTTTATG AAAACATAAA ACGGTGGCCA 1140 
CGTTAGAGCC GATCAGAGGT GTCCCTCTCT TTTGAGGTAC ATAAATGAAG GTGGAACCAC 1200 
GTTGCGACGT CCTTTCGAGG ATGTCGCATT TTTTTATTAG GATACTAATT ATGGAGTTGC 1260 
AAGAATTAGT GGAGCGCAGT TGGGCAATCC GACAAGCTTA TCACGAACTG GAAGTTAAGC 1320 
ATCATGATTC CAAGTGGACG GTAGAAGAAG ACCTCTTGGC TTTATCTAAT GATATTGGAA 1380 
ATTTCCAACG ACTGGTGATG ACAAAGCAAG GACGCTACTA TGATGAAACA CCCTACACAC 1440 
TGGAACAAAA ACTTTCAGAA AATATCTGGT GGCTATTAGA ACTTTCTCAA CGTTTCGATA 1500 
TAGACATTCT GACGGAAATG GAAAACTTCC TCTCTGATAA AGAAAAGCAA TTGAACGTTA 1560 
GGACTTGGAA GTAGTCTGCT GATAAAAAAT CAATGCTTAG AAACTATGAA ATAATAAAAA 1620 
AGGAGAACAT CATGATTAAC ATTACTTTCC CAGATGGCGC TGTTCGTGAA TTCGAATCTG 1680 
GCGTAACAAC TTTTGAAATT GCCCAATCTA TCAGCAATTC CCTAGCTAAA AAAGCCTTGG 1740 
CTGGTAAATT CAACGGCAAA CTCATCGACA CTACTCGCGC TATCACTGAA GATGGAAGCA 1800 
TCGAAATTGT GACACCTGAT CACGAAGATG CCCTTCCAAT CTTGCGTCAC TCAGCAGCTC 1860 
ACTTGTTCGC CCAAGCAGCT CGTCGTCTTT TCCCAGACAT TCACTTGGGA GTTGGTCCAG 1920 
CCATCGAAGA TGGTTTCTAC TACGATACTG ACAACACAGC TGGTCAAATC TCTAACGAAG 1980 
ACCTTCCTCG TATCGAAGAA GAAATGCAAA AAATCGTCAA AGAAAACTTC CCATCTATTC 2040 
GTGAAGAAGT GACTAAAGAC GAGGCACGTG AAATCTTCAA AAATGACCCT TACAAGTTGG 2100 
AATTGATTGA AGAACACTCA GAAGACGAAG GCGGTTTGAC TATCTATCGT CAGGGTGAAT 2160 
ATGTAGACCT CTGCCGTGGA CCTCACGTTC CATCAACAGG TCGTATCCAA ATCTTCCACC 2220 
TTCTCCATGT AGCTGGTGCG TACTGGCGTG GAAACAGCGA CAACGCTATG ATGCAACGTA 2280 
TCTACGGTAC AGCTTGGTTT GACAAGAAAG ACTTGAAAAA CTACCTTCAA ATGCGTGAAG 2340 
AAGCTAAGGA ACGTGACCAC CGTAAACTTG GTAAAGAGCT TGACCTCTTT ATGATTTCAC 2400 
AAGAAGTGGG ACAAGGTTTG CCATTCTGGT TGCCAAATGG TGCGACTATC CGTCGTGAAT 2460 
TGGAACGCTA CATCGTAAAC AAAGAGTTGG TTTCTGGCTA CCAACACGTC TACACTCCAC 2520 
CACTTGCTTC TGTTGAGCTT TACAAGACTT CTGGTCACTG GGATCATTAC CAAGAAGACA 2580 
TGTTCCCAAC CATGGACATG GGTGACGGGG AAGAATTTGT CCTTCGTCCA ATGAACTGTC 2640 
CGCACCACAT CCAAGTTTTC AAACACCATG TTCACTCTTA CCGTGAATTG CCAATCCGTA 2700 
TCGCTGAAAT CGGTATGATG CACCGTTACG AAAAATCTGG TGCCCTCACT GGCCTTCAAC 2760 
GTGTACGTGA AATGTCACTC AACGACGGTC ACCTATTCGT TACTCCAGAA CAAATCCAAG 2820 
AAGAATTCCA ACGTGCCCTT CAGTTGATTA TCGATGTTTA TGAAGACTTC AACTTGACTG 2880 
ACTACCGCTT CCGCCTCTCT CTTCGTGACC CTCAAGATAC TCATAAGTAC TTTGATAACG 2940 
ATGAGATGTG GGAAAATGCC CAAACCATGC TTCGTGCAGC TCTTGATGAA ATGGGCGTGG 3000 
ACTACTTTGA AGCCGAAGGT GAAGCAGCCT TCTACGGACC AAAATTGGAT ATCCAGATTA 3060 
AAACTGCCCT TGGAAAAGAA GAAACCCTTT CTACTATCCA ACTTGATTTC TTGTTGCCAG 3120 
AACGCTTCGA CCTCAAATAC ATCGGAGCTG ATGGCGAAGA TCACCGTCCA GTCATGATCC 3180 
ACCGTGGGGT TATCTCAACT ATGGAACGCT TCACAGCTAT CTTGATTGAG AACTACAAGG 3240 
GGGCCTTCCC AACATGGCTG GCACCACACC AAGTAACCCT CATCCCAGTA TCTAACGAAA 3300 
AACACGTGGA CTACGCTTGG GAAGTGGCCA AGAAACTCCG TGACCGCGGT GTCCGTGCAG 3360 
ACGTAGATGA GCGCAATGAA AAAATGCAGT TCAAGATCCG TGCTTCACAA ACCAGCAAGA 3420 

TTCCTTACCA ATTAATTGTT GGAGACAAAG AAATGGAAGA CGAAACAGTC AACGTTCGTC 3480 

GCTACGGCCA AAAAGAAACA CAAACTGTCT CAGTTGATAA TTTTGTTCAA GCTATCCTAG 3540 

CTGATATCGC CAACAAATCA CGCGTTGAGA AATAAGAGTC TAGCATAAAA GCCTCCAATC 3600 

TGGAGGCTTT TTCTCATCTA TTTTTACTCA AGGACTAAGT TCACTTGAGC AAACTGAATC 3660 

CGCACTGTCG TTCCTTTTCC GACCTCAGAC TCGATACGAA TCTGGTGCCC CAGTTCTTCA 3720 

GAAATTTTCT TAGATAGATA AAGGCCAAGT CCAGAGGACT GCTGGGTCAA ACGGCCATTG 3780 

TATCCTGAAA AGCCACGTTC AAATACTCGG AGGACATCAC TGTTTTTTAT CCCGATTCCC 3840 

GTATCTTTGA TACAAAGCTC TTGGTCATCC ATATAAATCT CCAGACCACC TTCCTTGGTG 3900 

TACTTGAGAC TGTTTGAGAT GATTTGCTCA ATAACCACTA GCAGCCACTT TTTATCCGTC 3960 

ACGATTTCTT TATCAAGGTC ATGTAGATTG ACATTTAAGC CTTTTTGAAT AAAGAAAAGA 4020 

GCATATTTAC GAATTATTTC CTTGACCAAG TCCTCAATTT GAACCTGCTT TAAGACCAAA 4080 

TCATCATGGA AACTTTCTAA ACGCAGGTAC TGTAAAACTA GGTTGGTATA GGAGTCGATT 4140 

TTGAAAATTT CCTGTTCTAG CTGCTGCTTC AGTTGGCGGT CGACCACTTC TGCAACTAAG 4200 

AGTTGACTGG CTGCAATGGG GGTCTTTATC TGATGGACCC ACAAGGTATA GTAATCCAGC 4260 

AAATCCGTCA GTTTTCTTTC TGCTTTTGAC CTCTGCTGAT AGAGTTCCAT CTCACGCGCT 4320 

TCTAATTTTT CTGCTAAAGC TATTTCCAAA GGAGACTTGG CTTCCCTCTC TCCATAGAGA 4380 

AGTTCCTGGC GATAGACCTG CGTTTCCACC AATATGTCCC AAGTGAAAAA TAATATGGTT 4440 

ACAAAGCAAC ACAAGAAGAA AAAGTAGAGG AAGTAAATTC CTAGACTGGC AAATAAAAAC 4500 

TGAAAGAGTA AGACAAGAAA TGCCAAAGAA AGCAGATAGA TAAAAAGACG ACTACGGGAG 4560 

CGCAGATAGG CTAGAAAAAA TTGTTTCCAA TCAAGCATGC TTCAATCCGT ACCCTATTCC 4620 

TTTCTTGGTC TCGATAAATC CTACCAATCC CTGCTCCTCC AACTTTTTAC GCAAACGAGC 4680 

CACATTGACA GAGAGGGTAT TATCATCAAT GAAAAAGTCA CTGTTCCAAA GTTCCCGCAT 4740 

CAGGTCGTCA CGTGCTACGA TGTTGCCTGC ATGCTCAAAT AACACGCGTA AAATCTGGAA 4800 

TTCATTCTTG GTCAAATTCA AGACTTGCCC TTGATAATGT AAATCCATGG ATTTGGTATT 4860 

GAGGATAACA CCAGCATATT CCAGCAAACT CTCATCACGC CCAAACTCAT AGGAACGACG 4920 

CAACAAGCCC TGAACCTTAG CTAAAAGAAC CTGCTGGTCA AAAGGCTTGG TCACAAAGTC 4980 

ATCCGCCCCC ATATTGATTG CCATGACAAT ATCCATAGCC TGGTCTCTCG AAGAAAGAAA 5040 

CATGATAGGT ACCTTGGAAA TCTTGCGGAT TTCCTGACAC CAGTGATAAC CATTAAACAA 5100 

GGGCAAACCA ATATCCATGA GGACCAGATG AGGTTOCGAC TGAACAAATA GACTCAAAAC 5160 
TTCCATAAAG TCTTCTACCA GGACCACTTC AAATCCCCAT TCAGAGAGCA TTTTCCCAAT 5220 
CTGTTGACGA ATGACCTGAT CATCTTCTAT TAATAAAATC TTGTGCATGC GCTTCTCCTT 5280 
TTCCATTATT ATAACAGATT TTTCCATGCT AGATGGTCTG AAACTGAATT TGAAATAGCC 5340 
TGTTTTTAGC CAGTACAAAC AGGCTATGCT ACTAGCTAAT TTGAGGGAAA TTTGCTAAGA 5400 
TAAATAAAAA GAAAGGAGCT CTTATGGCCA ATATTTTTGA CTATCTGAAA GATGTCGCAT 5460 
ATGATTCTTA TTACGACCTT CCCTTGAATG AGTTAGACAT TCTAACCTTA ATAGAAATCA 5520 
CCTACCTCTC CTTTGATAAT CTGGTCTCCA CACTTCCTCA ACGTCTTTTA GATCTAGCAC 5580 
CTCAGGTTCC AAGAGATCCC ACCATGCTTA CTAGCAAAAA TCGCCTTCAA TTATTAGATG 5640 
AATTGGCTCA ACACAAGCGC TTCAAAAATT GCAAACTCTC CCATTTTATC AACGACATCG 5700 
ACCCTGAACT GCAAAAGCAA TTTGCGGCTA TGACTTATCG TGTCAGCCTC GATACCTATC 5760 
TGATTGTCTT TCGTGGGACA GATGACAGTA TCATTGGCTG GAAGGAAGAT TTCCACCTGA 5820 
CCTATATGAA GGAAATTCCT GCTCAAAAGC ACGCCCTTCG CTATTTAAAG AACTTTTTTG 5880 
CCCATCATCC TAAGCAAAAG GTTATTCTAG CTGGGCATTC CAAGGGAGGA AATCTCGCTA 5940 
TCTATGCTGC TAGCCAAATT GAGCAAAGTT TGCAAAATCA GATCACAGCA GTTTATACAT 6000 
TTGATGCACC TGGTCTCCAT CAAGAATTGA CACAGACTGC GGGTTATCAA AGGATAATGG 6060 
ATAGAAGCAA GATATTCATT CCACAAGGTT CCATTATCGG TATGATGCTG GAAATTCCTG 6120 
CTCACCAAAT CATCGTTCAG AGTACTGCCC TGGGTGGCAT CGCCCAGCAC GATACCTTTA 6180 
GTTGGCAGAT TGAGGACAAG CACTTCGTCC AACTGGATAA GACCAACAGT GATAGCCAGC 6240 
AAGTAGACAC AACCTTTAAA GAATGGGTGG CCACAGTCCC TGACGAAGAA CTTCAGCTCT 6300 
ACTTCGACCT CTTCTTTGGC ACTATTCTTG ATGCTGGTAT TAGCTCTATC AATGACTTGG 6360 
CTTCCTTAAA GGCGCTTGAA TACATTCATC ATCTCTTTGT CCAAGCTCAA TCCCTCACTC 6420 
CAGAAGAAAG AGAAACCTTG GGTCGCCTTA CCCAGTTATT GATTGATACT CGTTACCAGG 6480 
CATGGAAAAA TAGATAATAC TCTTGAAAAT TAAATGTATA CAAAACAAAA GACCTAGAAT 6540 
ACATACTTTC ATGTGCATTC TAAGTCTTTT TAAATAGAAT CTAATAGTCA ATAAAAATCA 6600 
AAGAGCATTG AGAGATAATG GGGCTTGGAA CGTCCCTCTC GCTTCAACAA AATGACCCCA 6660 
TTATAGATTA AAAAGATGCC ACTTAGAAAA AGCAAAAAAG GAAGTAAGAC AAAGGCAAAT 6720 
ATATAAAAAG CTAACTGAAC ATTCTCGTAT CCATTTTTAT AAAAAAGGTA GGATAGATAA 6780 


AAATAACTTG AAATGAGGGA TAATAAAAAT AATACTGGAT 


TCCACAAACT 


TCTATTATCC 


6840 


TTCCAAAATG ACACTATAAA GGCTAATACA ATTCCTATAA 


CGAGATACAT 


TTCTTACTCC 


6900 
TTTAATAGCT ACATTTTATC ATAATTATCC AAAGAAAAAA GAGGGCATTT ATCCCTCTTA 6960 

ATCCTTCATC TGACTCTCTG CATCGGCCAC GACTTTTTCT AGACTGGTTT GACCAAGTTC 7020 

TGCCTCCATA GTCAACTGAA TTCTCTCCAA TTTTTGATCC AAAACATCAT GAATATGAGC 7080 

TCCTACAGGG CAATTTGGAT TCGGATTGTC ATGGAAACTG AAGAGTTGAC CTGTCTTACC 7140 

AAGACATTCG ACCGCCTGAT AAACATCTAA AAGACTAATA TCCTTAAGGT CCTTGACAAT 7200 

CTCTGTTCCG CCCGTTCCAC GOGCTACTGA AATCAGCTCT GCCTTCTTCA ACTGGGACAA 7260 

GATCTTTCTG ATAATGACAG GATTGACCCC GACACTAGCA GCCAGAAAAT CACTGGTCAC 7320 

CTTGCTTTCC TTCCCCTCGA GGGCAATGAT TATCAGCATA TGAGTCGCAA TGGTAAATCT 7380 

ACTTGGAATT TGCATCCTCT TCTCCTTTTT ACGAGGCTAC CCTGCCTCTA CTCTTCTTTT 7440 

TCTATTATTA TACCCTTTTT AGTTGTAATG TCAATCGTTA CCACTTTTCA ACCAGTCGTC 7500 

TAACTCCCGA TCGCAGCCCT CTTTCTGAGC CAATTCTCTC AAAAATTCCT GATGATGAGT 7560 

ATGGTGGATC CCATTGACCA GACTTTCATA GTAAACCTCA AAATAGGGAA GTCTCAGGTC 7620 

TTTAGCCAGC TGCAATTCAG CTGCTACATC GTAGTCTACC CGTCGGAAGT CCATATCTAC 7680 

CAGGCCTTTG TCATCAAACT CCAAAATCAT ATACTGGGCC CGCAAGTCCT TCCGTAGCTG 7740 

AGCGTCCAAA AAGAAAGGTT GGCCAATCGA ACCCGGATTG ACAATCAATT GCCCACCAGT 7800 

CCCGTAACGA AGCAACTGCT GGTGAATATG TCCATAAACA GCAATATCAC AGGGAGGATG 7860 

AGTCACCAAG CGGTCAAACT CCTCTTGTTT GCCAGTATGA ATCAACTCTC GCCCCCAGTT 7920 

CTTATCAGGC AGATGATGGC TAATTCCCAC CGTCAAATCC CCAAACTGAC GATGAATTTG 7980 

AAGAGGTTGA TTGTGGAGCA CTTCAATTTC TTCTAGGGAA ATTTCCTCTA AAACATACTG 8040 

GCACTGGCGC AAGAGATAGC GTTGACTGGG GCGAGTACTG TCCAATTCCT TACGGACACC 8100 

ATGCCAAAGA CTGTCTTCCC AGTTTCCCAA AACTCTAGCC GTAATCGGTA GTTGATCCAA 8160 

CAAGTCCAAA ATCCTTCTAC GCCCTGTCCC TGGCATGAGA ATATCTCCCA AAAGCCAGTA 8220 

TTCATCCACT CCTATCTGCC GAGCATCTGC CAAAACAGCC TCCAAGGCGG TGGTATTTCC 8280 

ATGAATATCT GAAAGAAGAG CTATTTTCGT CATATCCATC TCCTCGTTTT TTCTCTTGCA 8340 

ATAAGTATAA CATAAAAAGT CACAGCTAGA GAAATCTAGC TTTTTTTGAT ATACTAGATA 8400 

AAGATATTAG ACAAGAGGAA ACGAATGACC CCAAACAAAG AAGACTATCT AAAATGTATT 8460 

TATGAAATTG GCATAGACCT GCATAAGATT ACCAACAAGG AAATTGCGGC TCGCATGCAA 8520 

GTCTCTCCCC CTGCCGTAAC TGAAATGATC AAACGAATGA AAAGTGAAAA TCTCATCCTA 8580 

AAGGACAAGG AATGTGGCTA TCTACTGACT GACCTCGGTC TCAAACTGGT CTCTGAGCTC 8640 

TATCGTAAGC ACCGCTTGAT TGAAGTTTTT CTAGTTCATC ATTTAGACTA TACAAGTGAC 8700 
CAGATTCACG AGGAAGCTGA GGTCTTCGAA CACACTGTCT CTGACCTGTT CGTGGAAAGA 8760 

CTAGATAAAC TGCTAGGTTT CCCTAAAACC TGCCCCCACG GGGGAACTAT TCCTGCCAAG 8820 

GGAGAACTAC TCGTTGAAAT CAATAACCTC CCACTAGCTG ATATCAAGGA AGCTGGCGCC 8880 

TACCGCCTGA CTCGGGTGCA CGATAGTTTT GACATTCTCC ATTATCTGGA CAAGCACTCA 8940 

CTTCACATCG GTGACCAGCT CCAAGTCAAG CAGTTTGATG GCTTCAGCAA TACCTTCACT 9000 

ATCCTCAGTA ACGACGAGGA TTTACAAGTG AATATGGACA TTGCAAAACA ACTCTATGTC 9060 

GAGAAAATCA ACTAATTTCT CAAGTCCCCT ACCAACCCTG AAAGTTTTAT TTTGGCTCTT 9120 

TGTCAACTGT AGTGGGTTGA AGTCAGCTAA GCTCGAGAAA GGACAAATTT TGTCCTTTCT 9180 

TTTTTGATAT TCAGAGCGAT AAAAATCCGT TTTTTGAAGT TTTCAAAGTT CCGAAAACCA 9240 

AAGGCATTGC GCTTGATAAG TTTGATGAGA TTATTGGTCG CTTCCAGTTT GGCATTAGAA 9300 

TAGTGTAGTT GAAGGGCGTT GACAATCTTT TCTTTATCTT TGAGGAAGGT TTTAAAGACA 9360 

GTCTGAAAAA TAGGATGAAC CTGCTTTAGA TTGTCCTCAA TGAGTCCGAA AAATTTCTCC 9420 

GGTTTCTTAT TCTGAAAGTG AAACAGCAAG AGTTGATAGA GCTGATAGTG GTGTTTCAAG 9480 

TCTTGTGAAT AGCTCAAAAG CTTGTCTAAA ATCTCTTTAT TGGTTAAGTG CATACGAAAA 9540 

GTAGGACGAT AAAATCGCTT ATCACTCAGT TTACGGCTAT CCTGTTGTAT GAGCTTCCAG 9600 

TAGCGCTTGA TAGCCTTGTA TTCATGGGAT TTTCGATCCA ATTGGTTCAT AATTTGAACA 9660 

CGCACACGAC TCATAGCACG GCTAAGATGT TGTACAATGT GAAAGCGATC CAACACGATT 9720 

TTAGCATTCG GGAGTGAAAC AGTCTGGGAG ACTGTTTCAG CCTGAGCCTA GAAATTTGAA 9780 

AGCGAAGCTG TTTAGCCAAG TCATAGTAAG GACTAAACAT ATCCATCGTA ATGATTTTCA 9840 

CTTGACAACG AACGGCTCTA TCGTAGCGAA GAAAGTGATT TCGGATGACA GCTTGTGTTC 9900 

TGCCTTCAAG AACAGTGATA ATATTAAGAT TATCAAAATC TTGCGCAATG AAACTCATCT 9960 

TTCCCTTAGT GAAGGCATAC TCATCCCAAG ACATAATCTT TGGAAGCCGA GAAAAATCAT 10020 

GCTCAAAGTG AAAGTCATTG AGCTTGCGAA TGACAGTTGA AGTTGAAATG GCCAGCTGAT 10080 

GGGCAATATC AGTCATAGAA ATTTTTTCAA TTAACTTTTG AGCAATyTTT TGGTTGATGA 10140 

TACGAGGGAT TTGGTGATTT TTCTTTACCA GGGGAGTCTC AGCAACCATC ATTTTTGAAC 10200 

AGTGATAGCA CTTGAAACGA CGCTTTCTAA GGAGAATTCT AGAAGGCATA CCAGTCGTTT 10260 

CAAGATAAGG AATTTTAGAA GGTTTTTGAA AGTCATATTT CTTCAATTGG TTTCCGCACT 10320 

CAGGGCAAGA TGGGGCGTCG TAGTCCAGTT TGGCGATGAT TTCCTTGTGT GTATCCTTAT 10380 

TGATGATGTC TAAAATCTGG ATATTAGGGT CTTTAATGTC TAGTAATTTT GTGATAAAAT 10440 
GTAATTGTTC CATATGATTC TTTCTAATGA GTTGTTTTGT CGCTTTTCAT TATAGGTCAT 10500 

ATGGGACTTT TTTTCTACAA TAAAATAGGC TCCATAATAT CTATAGTGGA TTTACCCACT 10560 

ACAAATATTA TAGAACCGTA AAAATAGAAG GAGATAGCAG GTTTTCAAGC CTGCTATCTT 10620 

TTTTTGATGA CATTCAGGCT GATACGAAAT CATAAGAGGT CTGAAACTAC TTTCAGAGTA 10680 

GTCTGTTCTA TAAAATATAG TAGATTGAAA TAAGATGTGA ACAACTCTAT CAGGAAAGTC 10740 

AAATTAATTT ATAGAATTAT TTTAGCAGTC AAGGTGTACT GTTATAGATT CAATATATTA 10800 

TATGACTATT AACCTTGTCT TCTCCTAAAA TTGACTTTCT TGTTTTCTTA TCTTGTCCAC 10860 

TCGAAACAAG TATTGTAAGA ATTTGATTAT TTTTGAAAGT ACTTTTAATA TACTTGATAT 10920 

AGTTAAAAAA GATTTGAAAC TAAATTCCAA ATTAGAAAAA GACTTGAAAT ACTAAAAAAA 10980 

AAAAAGTATA CTCTAATTGA AAACGGTAAC AAAACTAATT TAGAGAATGA AATATAGAGT 11040 

ATTTCTCTCT TAAAAGTTTT TGGTGAAACG AGATGTAGAA AGGAGATTTA GCCAAAGAGT 11100 

CTATTAGTGC TAGAATAATA GATTAGAATT ATTTTAGAAA AACGAAGTGA GCAGCTTATA 11160 

AATTCAAGTC CCCAAATAGA TTCATACTAG TATCTTTTGC AAAAAATAAA GGGCGACTTC 11220 

CTTCATGAAT ATCAATTTCA TCTATAAGGA AGGTAGCTAA TTGAACTAAC TTATTTATTC 11280 

TGTTTGTCGC TAGAAAAATC AGACCTCCTT GTGAAGATTG AGGAGATACT TAATGAAAAT 11340 

CAAAGAAGAA ACTAGCAAGC TAGTAGCAGA TTGCCCAAAA CACCGCTTTG AGGTTGTAGA 11400 

TAAGACTGAC CTATATAATC CAAGGTGAAG CGACTGTGGT TTGAAGAGAT TTTCAAAGAG 11460 

TATAGGCTAG AGAGTAGTGT TTTTATGTCC TTCTAGTAGA AAATGCTAGA CAGAAGAATG 11520 

GGGAACTTGG ATAGGAAAAA TAGATTGAGA AAGGAGGTTA GAAGAGATGA TTATTACAAA 11580 

AATTAGCCGT TTAGGAACTT ATGTGGGAGT AAATCCACAT TTTGCAACAT TAATAGATTT 11640 

TCTAGAAAAA ACAGGACTAG AAAATTTAAC AGAAGGTTCG ATTGCTATCG ATGGTAATCG 11700 

ATTGTTTGGG AATTGCTTTA CTTATCTAGC AGATGGTCAA GCAGGGGCTT TCTTTGAAAC 11760 

CCACCAAAAA TATTTGGATA TTCATTTAGT TTTGGAAAAC GAAGAAGCCA TGGCTGTTAC 11820 

ATOGCCGGAA AATGTAAGCG TTACCCAAGA ATATGATGAA GAGAAAGATA TTGAATTATA 11880 

CACAGGGAAA GTGGAACAGT TGGTTCATTT GAGAGCTGGC GAATGCCTCA TCACTTTTCC 11940 

AGAAGATTTA CATCAACCCA AGGTTCGTAT AAATGATGAA CCTGTGAAAA AAGTTGTCTT 12000 

TAAAGTTGCG ATTTCTTAAT GTAGAAAGAG AAGAACGATG AAAAAAATGA GAAAGTTTTT 12060 

ATGTCTAGCT GGAATTGCGC TAGCGGCTGT TGCCTTGGTA GCTTGTTCAG GAAAAAAAGA 12120 

AGCTACAACT AGTACTGAAC CACCAACAGA ATTATCTGGT GAGATTACAA TGTGGCACTC 12180 

CTTTACTCAA GGACCCCGTT TAGAAAGTAT TCAAAAATCA GCAGATGCTT TCATGCAAAA 12240 
GCATCCAAAA ACGAAAATCA AGATTGAAAC ATTTTCTTGG AATGACTTCT ATACTAAATG 12300 

GACTACAGGT TTAGCAAATG GAAATGTGCC AGATATCAGT ACAGCTCTTC CTAACCAAGT 12360 

AATGGAAATG GTCAACTCAG ATGCTTTGGT TCCGCTAAAT GATTCTATCA AGCGTATTGG 12420 

ACAAGATAAA TTTAACGAAA CTGCCTTAAA TGAAGCAAAA ATCGGAGATG ATTACTACTC 12480 

TGTTCCTCTT TATTCACATG CACAAGTCAT GTGGGTTAGA ACAGATTTGT TAAAAGAACA 12540 

TAATATTGAG GTTCCTAAAA CTTGGGATCA ACTCTATGAA GCTTCTAAAA AATTGAAAGA 12600 

AGCTGGAGTT TATGGCTTGT CTGTTCCGTT TGGAACAAAT GACTTAATGG CAACACGTTT 12660 

CTTGAACTTC TACGTACGTA rTGGTGGAGG AAGCCTCTTA ACAAAAGATC TTAAAGCAGA 12720 

CTTGACAAGC CAACTTGCTC AAGATGGTAT TAAATACTGG GTTAAATTGT ATAAAGAAAT 12780 

CTCACCTCAA GATTCTTTGA ACTTTAATGT CCTTCAACAA GCTACCTTGT TCTATCAAGG 12840 

AAAAACAGCA TTTGACTTTA ACTCTGGCTT CCATATCGGA GGAATTAATG CCAACAGTCC 12900 

TCAATTGATT GATTCGATTG ATGCTTATCC TATTCCAAAA ATCAAAGAGT CTGATAAAGA 12960 

CCAAGGAATT GAAACCTCAA ACATTCCAAT GGTTGTTTGG AAAAATTCAA AACATCCAGA 13020 

AGTTGCTAAA GCATTCTTAG AAGCACTTTA TAATGAAGAA GACTACGTTA AATTCCTTGA 13080 

TTCAACTCCA GTAGGTATGT TGCCAACTAT TAAGGGGATT AGCGATTCTG CAGCCTATAA 13140 

AGAAAATGAA ACTCGTAAGA AATTTAAACA TGCTGAAGAA GTAATTACTG AAGCTGTTAA 13200 

AAAAGGTACT GCTATTGGTT ATGAAAATGG GCCAAGTGTA CAAGCTGGTA TGTTGACTAA 13260 

CCAACACATT ATTGAACAAA TGTTCCAAGA TATCATTACA AATGGAACAG ATCCTATGAA 13320 

AGCAGCAAAA GAAGCAGAAA AACAATTAAA TGATTTATTT GAGGCTGTTC AGTAGATGTA 13380 

AAAGACTAGA AAATAGGTGG GATAGTGAGC TGAAAAGCTC TAGCCCAATC TTGTAAAAGA 13440 

AGGGAGAAGG AGAATGGTTA AAGAACGTAA TTTAACTCGC TGGATATTTG TTTTGCCAGC 13500 

TATGATTATC GTAGGATTAC TCTTTGTTTA TCCGTTTTTC TCGAGTATTT TTTATAGCTT 13560 

TACCAATAAG CATTTGATTA TGCCTAATTA TAAATTTGTT GGTTTGGCTA ACTATAAAGC 13620 

TGTGCTATCA GATCCCAACT TCTTTAATGC GTTCTTTAAT TCAATTAAGT GGACCGTTTT 13680 

CTCATTAGTT GGTCAAGTTT TAGTAGGGTT TGTATTGGCT TTAGCTCTTC ACAGAGTACG 13740 

CCACTTCAAG AAATTATATA GGACATTATT GATTGTTCCT TGGGCATTTC CTACCATCGT 13800 

TATTGCCTTC TCTTGGCAGT GGATTCTAAA CGGGGTTTAT GGCTACTTAC CTAATCTAAT 13860 

CGTAAAATTA GGTTTAATGG AACATACACC TGCATTTTTG ACAGATAGTA CATGGGCATT 13920 

CCTATGTTTG GTGTTTATCA ACATTTGGTT TGGAGCACCA ATGATTATGG TTAATGTGCT 13980 
TTCAGCTTTG CAAACAGTAC CAGAAGAACA ATTTGAGGCT GCTAAGATAG ATGGTGCTTC 14040 

AAGTTGGCAG GTGTTCAAGT TTATCGTCTT TCCACATATT AAAGTGGTTG TAGGACTTCT 14100 

AGTTGTTTTG AGAACTGTAT GGATCTTTAA TAACTTTGAC ATTATCTACC TCATTACTGG 14160 

TGGTGGACCA GCCAATGCTA CAACGACGCT TCCAATTTTT GCTTACAACC TGGGCTGGGG 14220 

AACTAAATTG TTGGGTCGTG CTTCAGCAGT TACAGTACTG CTCTTTATCT TCTTGGTGGC 14280 

GATTTGCTTT ATCTACTTTG CTATCATCAG TAAGTGGGAA AAGGAGGGTA GAAAATAATG 14340 

AAGAAGAAAT CCAGTATTTA TTTAGATATT CTCTCACATG TACTTTTAGT TGGTGCGACC 14400 

ATCGTTGCAG TTTTCCCATT GGTATGGATT ATCATATCTT CTGTCAAAGG GAAAGGGGAA 14460 

TTAACTCAGT ATCCAACACG ATTTTGGCCT GAACAGTTTA CATTAGATTA TTTCACTCAT 14520 

GTTATCAACG ATTTGCACTT CATTGATAAC ATTCGAAACA GTTTAATCAT TGCCTTGGCT 14580 

ACAACCCTTA TTGCGATTAT TATTTCTGCT ATGGCAGCCT ATGGTATTGT TCGATTCTTT 14640 

CCTAAATTGG GAGCAATCAT GTCGAGACTA CTCGTCATTA CCTACATTTT CCCACCAATT 14700 

TTGTTAGCAA TTCCCTATTC AATTGCCATT GCTAAAGTTG GGTTAACAAA TAGTTTATTT 14760 

GGCTTGATGA TGGTTTATCT ATCTTTTAGT GTTCCATATG CAGTTTGGCT CTTAGTTGGA 14820 

TTTTTCCAAA CAGTTCCAAT TGGAATTGAA GAAGCGGCTA GAATTGATGG TGCAAATAAA 14880 

TTTGTTACGT TTTATAAAGT TGTGCTACCG ATTGTAGCAC CAGGTATTGT AGCAACAGCT 14940 

ATTTATACAT TTATCAATGC TTGGAATGAA TTCCTGTATG CCTTGATTTT GATTAACAAT 15000 

ACAGGAAAGA TGACAGTAGC AGTAGCCCTT CGTTCACTTA ATGGTTCAGA AATACTAGAC 15060 

TGGGGAGATA TGATGGCAGC GTCTGTTATT GTAGTTCTTC CATCAATTAT TTTCTTCTCT 15120 

ATCATCCAAA ATAAGATTGC AAGTGGATTA TCAGAAGGAT CTGTGAAGTA GACGAAAGAA 15180 

GGAAAAAAAT GAATAAAAGA GGTCTTTATT CAAAACTAGG AATTTCCGTT GTAGGCATTA 15240 

GTCTTTTAAT GGGAGTCCCC ACTTTGATTC ATGCGAATGA ATTAAACTAT GGTCAACTGT 15300 

CCATATCTCC TATTTTTCAA GGAGGTTCAT ATCAACTGAA CAATAAGAGT ATAGATATCA 15360 

GCTCTTTGTT ATTAGATAAA TTGTCTGGAG AGAGTCAGAC AGTAGTAATG AAATTTAAAG 15420 

CAGATAAACC AAACTCTCTT CAAGCTTTGT TTGGCCTATC TAATAGTAAA GCAGGCTTTA 15480 

AAAATAATTA CTTTTCAATT TTCATGAGAG ATTCTGGTGA GATAGGTGTA GAAATAAGAG 15540 

ACGCCCAAAA GGGAATAAAT TATTTATTTT CCAGACCAGC TTCATTATGG GGAAAACATA 15600 

AAGGACAGGC AGTTGAAAAT ACACTAGTAT TTGTATCTGA TTCTAAAGAT AAAACATACA 15660 

CAATGTATGT TAATGGAATA GAAGTGTTCT CTGAAACAGT TGATACATTT TTGCCAATTT 15720 

CAAATATAAA TGGTATAGAT AAGGCAACAC TAGGAGCTGT TAATCGTGAA GGTAAGGAAC 15780 
ATTACCTCGC AAAAGGAAGT ATTGATGAAA TCAGTCTATT TAACAAAGCA ATTAGTGATC 15840 

AGGAAGTTTC AACTATTCCC TTGTCAAATC CATTTCAGTT AATTTTCCAA TCAGGAGATT 15900 

CTACTCAAGC TAACTATTTT AGAATACCGA CACTATATAC ATTAAGTAGT GGAAGAGTTC 15960 

TATCAAGTAT TGATGCACGT TATGGTGGGA CTCATGATTC TAAAAGTAAG ATTAATATTG 16020 

CCACTTCTTA TAGTGATGAT AATGGGAAAA CGTGGAGTGA GCCAATTTTT GCTATGAAGT 16080 

TTAATGACTA TGAGGAGCAG TTAGTTTACT GGCCACGAGA TAATAAATTA AAGAATAGTC 16140 

AAATTAGTGG AAGTGCTTCA TTCATAGATT CATCCATTGT TGAAGATAAA AAATCTGGGA 16200 

AAACGATATT ACTAGCTGAT GTTATGCCTG CGGGTATTGG AAATAATAAT GCAAATAAAG 16260 

CCGACTCAGG TTTTAAAGAA ATAAATGGTC ATTATTATTT AAAACTAAAG AAGAATGGAG 16320 

ATAACGATTT CCGTTATACA GTTAGAGAAA ATGGTGTCGT TTATAATGAA ACAACTAATA 16380 


AACCTACAAA 


TTATACTATA 




ATGAAGTTTT 


GGAGGGAGGA 


AAGTCTTTAA 


16440 


CAGTCGAACA ATATTCGGTT GATTTTGATA GTGGCTCTTT AAGAGAAAGG CATAATGGAA 16500 

AACAGGTTCC TATGAATGTT TTCTACAAAG ATTCGTTATT TAAAGTGACT CCTACTAATT 16560 

ATATAGCAAT GACAACTAGT CAGAATAGAG GAGAGAGTTG GGAACAATTT AAGTTGTTGC 16620 

CTCCGTTCTT AGGAGAAAAA CATAATGGAA CTTACTTATG TCCCGGACAA GGTTTAGCAT 16680 

TAAAATCAAG TAACAGATTG ATTTTTGCAA CATATACTAG TGGAGAACTA ACCTATCTCA 16740 

TTTCTGATGA TAGTGGTCAA ACATGGAAGA AATCCTCAGC TTCAATTCCG TTTAAAAATG 16800 

CAACAGCAGA AGCACAAATG GTTGAACTGA GAGATGGTGT GATTAGAACA TTCTTTAGAA 16860 

CCACTACAGG TAAGATAGCT TATATGACTA GTAGAGATTC TGGAGAAACA TGGTCGAAAG 16920 

TTTCGTATAT TGATGGAATC CAACAAACTT CATATGGCAC ACAAGTATCT GCAATTAAAT 16980 

ACTCTCAATT AATTGATGGA AAAGAAGCAG TCATTTTGAG TACACCAAAT TCTAGAAGTG 17040 

GCCGCAAGGG AGGCCAATTA GTTGTCGGTT TAGTCAATAA AGAAGATGAT AGTATTGATT 17100 

GGAAATACCA CTATGATATT GATTTGCCTT CGTATGGTTA TGCCTATTCT GCGATTACAG 17160 

AATTGCCAAA TCATCACATA GGTGTACTGT TTGAAAAATA TGATTCGTGG TCGAGAAATG 17220 

AATTGCATTT AAGCAATGTA GTTCAGTATA TAGATTTGGA AATTAATGAT TTAACAAAAT 17280 

AAAGGAGAAA AACATGGTTA AATACGGTGT TGTTGGAACA GGGTATTTTG GAGCTGAATT 17340 

GGCTCGCTAC ATGCAAAAGA ATGATGGAGC AGAGATTACT CTTCTCTATG ATCCAGATAA 17400 

TGCAGAGGCG ATTGCAGAAG AATTGGGAGC AAAAGTAGCA AGTTCCTTAG ATGAGTTGGT 17460 

TTCTAGCGAT GAAGTAGATT GTGTTATCGT CGCAACTCCA AATAATCTTC ATAAGGAACC 17520 
GGTTATTAAG GCTGCACAGC ATGGTAAAAA TGTTTTCTGT GAAAAACCAA TTGCGCTTTC 17580 

TTATCAAGAT TGTCGCGAGA TGGTAGATGC GTGTAAAGAA AACAATGTAA CCTTTATGGC 17640 

AGGACATATT ATGAATTTCT TTAATGGTGT TCATCATGCA AAAGAACTCA TTAATCAAGG 17700 

AGTTATCGGA GACGTTCTAT ATTGTCATAC AGCTCGTAAT GGTTGGGAAG AACAACAACC 17760 

GTCAGTATCA TGGAAAAAAA TTCGTGAAAA ATCAGGTGGT CACTTGTATC ACCACATCCA 17820 

TGAATTGGAT TGCGTTCAAT TCCTTATGGG GGGCATGCCT GAAACTGTAA CCATGACAGG 17880 

TGGAAATGTG GCCCATGAAG GTGAACATTT CGGTGATGAA GATGATATGA TTTTTGTCAA 17940 

TATGGAATTT TCTAATAAGC GTTTTGCCTT GTTAGAATGG GGTTCAGCTT ATCGTTGGGG 18000 

TGAACATTAT GTCTTAATCC AAGGAAGCAA AGGTGCCATC CGCTTAGACT TATTCAACTG 18060 

TAAAGGAACT CTTAAGCTAG ATGGGCAAGA AAGCTATTTC TTGATTCACG AATCGCAAGA 18120 

AGAAGATGAT GATCGGACTC GTATCTATCA TAGTACAGAG ATGGATGGAG CAATTGCTTA 18180 

TGGTAAACCA GGTAAACGTA CTCCATTATG GCTATCATCT GTCATTGATA AAGAAATGCG 18240 

CTATCTGCAT GAGATTATGG AAGGAGCTCC AGTATCAGAA GAATTTGCAA AACTTTTGAC 18300 

AGGrGAAGCT GCCCTAGAAG CAATTGCTAC TGCAGATGCT TGTACCCAGT CTATGTTTGA 18360 

AGATCGCAAA GTAAAATTGT CAGAAATTGT AAAATAAATT TTGGTATTCT CCTATTTATA 18420 

GGTCGACTTG CTCCTCTGAA AGTACTTTTA GAGGAGCTGT TTGACTTTGC TAGTTTTTGA 18480 

AACTGAAATC TATTATACTA CAAACTATTG AAAGCGTTTT AATTTTAAGG TATAATAATC 18540 

TCATAGAAAT AAAGAAAAGG AGGAAAGAGG ATGCCACAGA TTAGCAAAGA AGCCTTGATT 18600 

GAGCAAATCA AAGATGGAAT CATCGTTTCT TGTCAGGCTC TTCCTCATGA ACCGCTTTAT 18660 

ACAGAAGCGG GAGGGGTGAT TCCCTTGCTG GTCAAAGCGG CTGAGCAAGG TGGAGCAGTC 18720 

GGTATCCGAG CAAACAGTGT TCGCGATATC AAGGAAATTA AGGAAGTCAC TAAACTTCCA 18780 

ATCATTGGGA TTATCAAACG TGATTATCCA CCTCAGGAAC CCTTCATCAC GGCTACTATG 18840 

AAAGAAGTTG ATGAATTGGC AGAACTGGAC ATCGAGGTGA TTGCTCTGGA TTGTACCAAG 18900 

CGTGAACGCT ACGATGGTTT GGAAATTCAA GAGTTCATTC GTCAGGTTAA GGAGAAATAT 18960 

CCTAATCAGC TTTTGATGGC TGATACTAGT ATCTTCGAAG AAGGGCTAGC AGCTGTAGAA 19020 

GCAGGAATTG ACTTTGTCGG AACAACCTTA TCAGGCTACA CATCCTACAG TCCAAAAGTA 19080 

GACGGTCCAG ATTTTGAATT GATTAAGAAA CTCTGTGATG CTGGTGTAGA TGTCATTGCA 19140 

GAAGGAAAAA TTCATACACC AGAACAAGCC AAACAAATCC TTGAATATGG AGTGCGAGGC 19200 

ATCGTTGTTG GTGGCGCCAT TACTAGACCA AAAGAGATTA CAGAACGCTT CGTTGCTAGT 19260 

CTTAAATAAG ATGTGAGGGG GAGTTTTATG TTTAAAGTTT TACAAAAAGT TGGAAAAGCT 19320 
TTTATGTTAC CTATAGCTAT ACTTCCTGCA GCAGGTCTAC TTTTGGGGAT TGGTGGTGCA 19380 

CTTTCAAACC CAACCACGAT AGCAACTTAT CCAATACTAG ACAATAGTAT TTTTCAATCA 19440 

ATATTCCAAG TAATGAGCTC TGCAGGAGAG GTTGTATTCA GTAATTTGTC ACTACTTCTC 19500 

TGTGTGGGAT TATGTATTGG CTTAGCGAAA CGAGATAAAG GAACCGCTGC GTTAGCAGGA 19560 

GTAACTGGTT ACTTAGTTAT GACTGCAACG ATCAAAGCTT TGGTAAAACT TTTTATGGCA 19620 

GAAGGATCTG CAATTGATAC TGGAGTTATT GGAGCATTAG TTGTCGGAAT AGTTGCCGTA 19680 

TATTTGCACA ACCGATATAA CAATATTCAA TTACCTTCCG CTTTAGGATT CTTTGGAGGT 19740 

TCACGCTTCG TTCCTATTGT TACATCGTTC TCTTCTATCT TGATTGGCTT TGTCTTCTTT 19800 

GTTATTTGGC CACCTTTCCA ACAACTTCTT GTTTCTACAG GTGGATATAT TTCTCAGGCG 19860 

GGTCCAATTG GAACTTTTCT ATATGGATTT TTAATGAGAC TTTCTGGAGC AGTAGGCTTA 19920 

CATCATATAA TTTACCCTAT GTTTTGGTAT ACTGAACTTG GTGGTGTTGA AACTGTTGCA 19980 

GGACAAACAG TGGTTGGAGC TCAAAAAATA TTTTTTGCTC AATTAGCCGA TTTGGCCCAT 20040 

TCTGGATTAT TTACAGAAGG AACAAGGTTT TTTGCAGGTC GTTTCTCAAC AATGATGTTC 20100 

GGTTTACCGG CTGCCTGTTT AGCGATGTAC CATAGTGTTC CTAAAAATCG TCGTAAAAAA 20160 

TACGCGGGTT TGTTTTTTGG AGTTGCTTTA ACATCTTTTA TTACCGGTAT TACAGAACCA 20220 

ATTGAATTTA TGTTTCTATT CGTCAGTCCG GTTCTATATG TTGTTCACGC ATTCCTTGAT 20280 

GGTGTTAGCT TCTTTATTGC AGACGTCTTA AATATTTCAA TAGGAAACAC ATTTTCAGGA 20340 

GGTGTAATCG ATTTCACTTT ATTTGGAATT TTGCAGGGGA ACGCTAAGAC GAATTGGGTT 20400 

CTTCAGATTC CATTTGGACT TATTTGGAGT GTTTTGTATT ATATTATTTT TAGATGGTTC 20460 

ATTACTCAAT TCAACGTTCT AACGCCAGGG CGAGGAGAAG AAGTAGATTC TAAAGAAATT 20520 

TCTGAATCCG CAGATTCAAC TTCAAATACT GCAGATTATT TAAAACAGGA TAGCCTACAA 20580 

ATTATCAGAG CCTTGGGTGG ATCAAATAAT ATAGAAGATG TAGATGCTTG TGTGACACGT 20640 

TTACGTGTAG CTGTAAAAGA AGTTAATCAA GTTGATAAAG CACTTTTAAA ACAAATTGGT 20700 

GCAGTTGATG TCTTAGAAGT GAAGGGTGGC ATTCAAGCAA TCTATGGAGC AAAAGCAATC 20760 

TTATATAAAA ATAGTATTAA TGAAATTTTA GGTGTAGATG ATTAAGTACT TACTGACTTA 20820 

ATAAAAAACA GAGGAGAGTG ATGGATGAGT AGGATGAAAT GAAATCGCAT ACAAGAAATA 20880 

AAGAACTCAT TATCCAAGTT GGATACGCTT ATTACATAGG AGAATACAAA TGAAATTTAG 20940 

AAAATTAGCT TGTACAGTAC TTGCGGGTGC TGCGGTTCTT GGTCTTGCTG CTTGTGGCAA 21000 

TTCTGGCGGA AGTAAAGATG CTGCCAAATC AGGTGGTGAC GGTGCCAAAA CAGAAATCAC 21060 
TTGGTGGGCA TTCCCAGTAT TTACCCAAGA AAAAACTGGT GACGGTGTTG GAACTTATGA 21120 

AAAATCAATC ATCGAAGCGT TTGAAAAAGC AAACCCAGAT ATAAAAGTGA AATTGGAAAC 21180 

CATCGACTTC AAGTCAGGTC CTGAAAAAAT CACAACAGCC ATCGAAGCAG GAACAGCTCC 21240 

AGACGTACTC TTTGATGCAC CAGGACGTAT CATCCAATAC GGTAAAAACG GTAAATTGGC 21300 

TGAGTTGAAT GACCTCTTCA CAGATGAATT TGTTAAAGAT GTCAACAATG AAAACATCGT 21360 

ACAAGCAAGT AAAGCTGGAG ACAAGGCTTA TATGTATCCG ATTAGTTCTG CCCCATTCTA 21420 

CATGGCAATG AACAAGAAAA TGTTAGAAGA TGCTGGAGTA GCAAACCTTG TAAAAGAAGG 21480 

TTGGACAACT GATGATTTTG AAAAAGTATT GAAAGCACTT AAAGACAAGG GTTACACACC 21540 

AGGTTCATTG TTCAGTTCTG GTCAAGGGGG AGACCAAGGA ACACGTGCCT TTATCTCTAA 21600 

CCTTTATAGC GGTTCTGTAA CAGATGAAAA AGTTAGCAAA TATACAACTG ATGATCCTAA 21660 

ATTCGTCAAA GGTCTTGAAA AAGCAACTAG CTGGATTAAA GACAATTTGA TCAATAATGG 21720 

TTCACAATTT GACGGTGGGG CAGATATCCA AAACTTTGCC AACGGTCAAA CATCTTACAC 21780 

AATCCTTTGG GCACCAGCTC AAAATGGTAT CCAAGCTAAA CTTTTAGAAG CAAGTAAGGT 21840 

AGAAGTGGTA GAAGTACCAT TCCCATCAGA CGAAGGTAAG CCAGCTCTTG AGTACCTTGT 21900 

AAACGGGTTT GCAGTATTCA ACAATAAAGA CGACAAGAAA GTCGCTGCAT CTAAGAAATT 21960 

CATCCAGTTT ATCGCAGATG ACAAGGAGTG GGGACCTAAA GACGTAGTTC GTACAGGTGC 22020 

TTTCCCAGTC CGTACTTCAT TTGGAAAACT TTATGAAGAC AAACGCATGG AAACAATCAG 22080 

CGGCTGGACT CAATACTACT CACCATACTA CAACACTATT GATGGATTTG CTGAAATGAG 22140 

AACACTTTGG TTCCCAATGT TGCAATCTGT ATCAAATGGT GACGAAAAAC CAGCAGATGC 22200 

TTTGAAAGCC TTCACTGAAA AAGCGAACGA AACAATCAAA AAAGCTATGA AACAATAGTC 22260 

CTTAGTTATT CTATAAAAAG TAGTTTTTTA AAGAACCTAA GAGTGTATAC CCCCTTTTCC 22320 

CTCTACACAG ATAGTGTAAG AAAAGGGGGC TTTTGTTTAA AATGTAAGAA ACTGTCACGA 22380 

AATTAAAATG AAGTTCTTAC ATAAGCGAAT CATAAAAAAT TTCATTTTGA TTTTAAAACA 22440 

GTTCAAGAAA GTCAAAAAAT TATTCTATTT GAAAGAGAGG TGCCGACTGT GAAAGTCAAT 22500 

AAAATCCGTA TGCGGGAAAC AGTGATTTCC TACGCTTTCC TAGCACCAGT ATTATTCTTC 22560 

TTTGTCATCT TTGTGTTGGC TCCGATGGTG ATGGGCTTCA TTACAAGTTT CTTTAACTAC 22620 

TCAATGACTA AATTTGAGTT TGTAGGCTTG GATAACTATA TCCGTATGTT TAAAGATCCT 22680 

GTCTTTACAA AATCTCTGAT TAACACAGTT ATTTTGGTTA TTGGATCTGT ACCAGTTGTT 22740 

GTTCTATTCT CACTCTTTGT AGCATCTCAG ACCTATCATC AAAATGTCAT TGCCAGATCC 22800 

TTCTACCGTT TCGTCTTCTT CCTTCCTGTT GTAACGGGTA GTGTTGCCGT GACAGTTGTT 22860 
TGGAAATGGA TTTATGACCC ACTATCAGGG ATTCTAAACT TTGTCCTTAA GTCCAGCCAC 22920 

ATCATCAGCC AAAACATTTC TTGGTTGGGA GATAAAAACT GGGCATTGAT GGCGATTATG 22980 

ATTATTCTCT TGACCACTTC AGTTGGTCAG CCCATCATCC TTTATATCGC TGCCATGGGG 23040 

AATATTGACA ATTCACTGGT TGAAGCGGCG CGTGTTGATG GTGCAACTGA GTTTCAAGTT 23100 

TTTTGGAAGA TTAAATGGCC AAGCCTTCTT CCAACAACTC TTTATATTGC AATCATCACA 23160 

ACAATTAACT CATTCCAGTG TTTCGCCTTG ATTCAGCTTT TGACATCTGG TGGTCCAAAC 23220 

TACTCAACAA GTACCTTGAT GTACTACCTT TACGAAAAAG CCTTCCAATT GACAGAATAC 23280 

GGCTATGCCA ACACAATTGG TGTCTTCTTG GCAGTCATGA TTGCTATCGT AAGCTTTGTT 23340 

CAA1TTAAAG TACTTGGAAA CGACGTAGAA TACTAAAGAA AGGAGACAGC TATGCAATCT 23400 

ACAGAAAAAA AACCATTAAC AGCCTTTACT GTTATTTCAA CAATCATTTT GCTCTTGTTG 23460 

ACTGTGCTGT TCATCTTTCC ATTCTACTGG ATTTTGACAG GGGCATTCAA ATCACAACCT 23520 

GATACAATTG TTATTCCTCC TCAGTGGTTC CCTAAAATGC CAACCATGGA AAACTTCCAA 23580 

CAACTCATGG TGCAGAACCC TGCCTTGCAA TGGATGTGGA ACTCAGTATT TATCTCATTG 23640 

GTAACCATGT TCTTAGTTTG TGCAACCTCA TCTCTAGCAG GTTATGTATT GGCTAAAAAA 23700 

CGTTTCTATG GTCAACGCAT TCTATTTGCT ATCTTTATCG CTGCTATGGC GCTTCCAAAA 23760 

CAAGTTGTCC TTGTACCATT GGTACGTATC GTCAACTTCA TGGGAATCCA TGATACTCTC 23820 

TGGGCAGTTA TCTTGCCTTT GATTGGATGG CCATTCGGTG TCTTCCTCAT GAAACAGTTC 23880 

AGTGAAAATA TCCCTACAGA GTTGCTTGAA TCAGCTAAAA TCGACGGTTG TGGTGAGATT 23940 

CGTACCTTCT GGAGTGTAGC CTTCCCGATT GTGAAACCAG GGTTTGCAGC CCTTGCAATC 24000 

TTTACCTTCA TCAATACTTG GAATGACTAC TTCATGCAAT TGGTAATGTT GACTTCACGT 24060 

AACAATTTGA CCATCTCACT TGGGGTTGCG ACCATGCAGG CTGAAATGGC AACCAACTAT 24120 

GGTTTGATTA TGGCAGGAGC TGCCCTTGCT GCTGTTCCAA TCGTCACAGT CTTCCTAGTC 24180 

TTCCAAAAAT CCTTCACACA GGGTATTACT ATGGGAGCGG TCAAAGGATA ATACTCTGCG 24240 

AAAATCTCTT CAAACTACGT CAGCTTCACC TTGCCATACT TAAGTATTGC CTGCGGTTAG 24300 

CTTCCTAGTT TGTTCTTCAA TTTTCATTGA GTATAGGAAA ATCAATCTAT CAAGATACAG 24360 

AAGTATATTT TATAGATTTA GAGAATATAG AGGTTATAAG TGTCTACAAA ATGGAGGGTA 24420 

TGCAGTTACT TTATGAAGTT TTGTCAGACA CTTATAAACT TAAGAATGGT TTTAGTTAAC 24480 

TATCAGAAAC GAAGGAAAGA GTATGATTTT TGACGATTTG AAAAACATCA CCTTTTACAA 24540 

AGGGATTCAT CCTAATTTAG ACAAGGCTAT CGACTATCTC TACCAACATC GTAAGGATTC 24600 
TTTCGAATTA GGAAAGTATG ATATTGATGG AGATAAAGTC TTTCTAGTTG TTCAGGAAAA 24660 

TGTCCTCAAT CAAGCTGAAA ATGATCAATT TGAGTATCAT AAGAACTATG CAGATTTGCA 24720 

TTTGCTGGTA GAAGGACATG AATATTCGAG CTACGGTTCA CGTATCAAAG ACGAGGCAGT 24780 

AGCATTCGAC GAAGCGAGTG ACATTGGCTT TGTTCATTGT CATGAACACT ACCCACTCTT 24840 

GTTGGGTTAT CACAATTTTG CGATTTTCTT CCCAGGTGAG CCACATCAGC CAAATGGTTA 24900 

TGCAGGCATG GAAGAAAAGG TTCGAAAATA TCTCTTTAAA ATTTTGATTG ATTAAAAATA 24960 

GGATGAATTG TTTTTTTGTA AAGCTTTGAT AATACTCTAC CATGAAATTG ATCTTTGTGA 25020 

GGTAGAGAAA TGAGAATAAA ATATTTAAAA ATTGGTATCT TCTAAGTATG CTGCAAGAGC 25080 

TAGTTTCTTA GATGGACAGG GGATTACAGT TGATGAGATG GCTTGGATAA TTAGGGGCAT 25140 

TGTGAATGCA TTGATTGGTA GATACATAAA ATTAGGTACT TATGCGGCTA AGTATGGTAT 25200 

TAGTATGGCA CGCTCGATCT TAAGTAGGGT AGCTGCAACT GCAGCAGCAA GAGTAGGATT 25260 

ACTGACCAAG ATTTCTGGAT GGATTTTACG AGTAGCTGTG AATGTAGCTG ATGTATATGG 25320 

TAATTTTGCC AACAATATTG CTGCAGCTTG GGATGCATAT GATAAAATTC CTAACAATGG 25380 

TCGTATAAAC TTTTAAAATG CGAGAATGAA AGCACTTTGT ATTTTTTTAT TGAATATGTT 25440 

AGCTTGGACA GTGCTTGCAA TGATAATTCG TGGAGGGCTA GATGGATTTG ATAGGCATAC 25500 

TTGGAGTACT ATTTTAATTG CGTCGCTGTT CGGGGTATAT GATTATAAGC CCATAGATAA 25560 

AAATAGAAAA AAGTCCAAAA GAAAAAATAG ATTTGTTCAT GGTAGGGACT TATGAAAGCT 25620 

TTACTGACAA AAAAGAAAAC AGTTTACAAA GAAAAATGAT GGAGGAGCAA ACATGGCACA 25680 

AAAAGGAGTA AGCCTTATCA AGGCAGCATT TGATACAGAT AACTTTCTCA TGCGTTTTAG 25740 

TGAGAAGGTC TTGGACATCG TGACAGCCAA TCTTCTTTTT GTCGTCTCTT GTTTACCCAT 25800 

CGTGACGATT GGAGTGGCTA AAATCAGCCT CTACGAGACC ATGTTCGAAG TTAAGAAGAG 25860 

CAGACGGGTG CCTGTTTTTA AAATCTATCT AAGATCTTTC AAGCAAAATC TGAAACTAGG 25920 

TCTTCAGCTG GGTTTAATGG AGTTAGGAAT TGTGTTTCTT ACCCTTTCAG ATCTCTATCT 25980 

TTTCTGGGGT CAAACAGCTC TGCCCTTCCA ATTGCTGAAA GCCATTTGTT TAGGTATTCT 26040 

GATTTTTCTT ACTATCGTGA TGCTGGCTAG TTACCCTATC GCGGCACGTT ATGACCTATC 26100 

TTGGAAAGAA ATTCTTCAAA AAGGATTGAT GTTGGCTAGT TTTAACTTTC CTTGGTTCTT 26160 

CCTCATGTTA GCCATTCTTG TCCTCATTGT GATGGTTCTT TATCTGTCCG CCTTCAGTCT 26220 

ACTCTTAGGT GGCTCAGTCT TCCTACTTTT TGGGTTTGGA CTATTGGTCT TTATCCAGAC 26280 

TGGATTGATG GAGAAAATTT TCGCAAAATA CCAATAGGAG CTTTATTTCT GAAACTACTT 26340 

TCAAAGGCTC CAAACGCTAT TCTATAAGCG AGAAACTAAA ATCGG 26385 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



CCTGCCCGCA TTGCCCTAGG CATTAAGTAA ACATATAAAA GCATGTGAGA GACTGTTGGA 60 

AAAGCGAGGA AATTTCCCCT CTTTTCCTCT AGTCTCTCCT TTCTTTTGCT GATTTTATTC 120 

AAAGAAAATG ATATAATAGT AGTTATGGAG AAAAAGAAAT TACGCATCAA TATGTTGAGT 180 

TCAAGTGAGA AAGTAGCAGG ACAGGGAGTT TCAGGTGCTT ACCGTGAATT AGTTCGTCTT 240 

CTTCACCGTG CTGCCAAGGA CCAATTGATT GTTACAGAAA ATCTTCCAAT CGAGGCAGAT 300 

GTGACTCACT TTCATACGAT TGATTTTCCC TATTATTTAT CAACCTTCCA AAAGAAACGC 360 

TCAGGGAGAA AGATTGGCTA TGTGCATTTC TTGCCAGCTA CACTTGAGGG AAGTTTGAAA 420 

ATTCCATTTT TCTTAAAGGG AATTGTGAAA CGCTATGTAT TTTCTTTTTA CAACCGGATG 480 

GAGCACTTGG TTGTGGTCAA TCCTATGTTT ATTGAGGATT TGGTAGCAGC TGGTATTCCA 540 

CGTGAAAAAG TGACCTATAT TCCTAACTTT GTCAACAAGG AAAAATGGCA TCCTCTACCA 600 

CAAGAAGAGG TAGTCAGACT GCGCACAGAT CTTGGTCTTA GTGACAATCA GTTTATCGTA 660 

GTAGGTGCTG GGCAAGTTCA GAAACGTAAA GGGATTGATG ACTTTATCCG TCTGGCTGAG 720 

GAATTGCCTC AGATTACCTT TATCTGGGCT GGTGGCTTCT CTTTTGGTGG TATGACAGAT 780 

GGTTATGAAC ACTATAAGAA AATTATGGAA AATCCCCCTA AAAATTTGAT TTTTCCAGGC 840 

ATTGTATCGC CAGAGCGGAT GCGCGAATTG TATGCTCTAG CGGATCTTTT CTTGTTGCCT 900 

AGTTACAATG AGCTCTTTCC TATGACTATT TTAGAAGCTG CGAGTTGTGA GGCTCCTATT 960 

ATGTTGCGTG ATTTAGATCT CTATAAGGTG ATTTTGGAGG GAAATTATCG GGCGACAGCG 1020 

GGTAGAGAAG AGATGAAAGA GGCTATTTTG GAATATCAAG CAAATCCTGC TGTCTTAAAA 1080 

GATCTCAAAG AAAAGGCTAA GAATATTTCC AGAGAGTATT CTGAAGAGCA TCTGTTACAA 1140 

ATCTGGTTGG ACTTTTATGA GAAACAAGCC GCTTTAGGGA GAAAGTAAAA AGTGAGGTAA 1200 

TCTATGCGAA TTGGTTTATT TACAGATACC TATTTTCCTC AGGTTTCTGG TGTTGCGACC 1260 

AGTATTCGAA CCTTGAAAAC AGAACTTGAA AAGCAGGGAC ATGCTGTTTT TATCTTTACG 1320 

ACGACAGATA AGGATGTCAA TCGCTACGAA GATTGGCAAA TTATCCGCAT TCCAAGTGTT 1380 
CCTTTCTTTG CTTTTAAGGA TCGTCGCTTT GCCTACCGAG GTTTTAGCAA GGCACTTGAA 1440 

ATTGCTAAAC AGTATCAGCT AGATATTATC CATACTCAGA CAGAATTTTC TCTTGGCCTG 1500 

TTGGGGATTT GGATTGCGCG TGAATTGAAA ATTCCAGTCA TCCATACCTA TCACACCCAG 1560 

TATGAAGACT ATGTCCATTA TATTGCTAAG GGGATGTTGA TCCGGCCGAG TATGGTCAAG 1620 

TATCTGGTTA GAGGTTTCCT GCATGATGTG GATGGGGTTA TTTGCCCTAG TGAGATTGTC 1680 

CGTGACTTGC TATCTGATTA TAAGGTCAAG GTTGAAAAAC GGGTCATTCC TACTGGGATT 1740 

GAATTAGCCA AGTTTGAGCG TCCGGAAATC AAGCAGGAAA ATTTGAAAGA ACTGCGTAGT 1800 

AAACTAGGGA TTCAAGATGG TGAAAAGACG TTGCTTAGTC TTTCGAGAAT CTCCTATGAA 1860

AAAAATATTC AAGCAGTTTT AGCAGCCTTT GCTGATGTTC TGAAAGAGGA AGACAAGGTT 1920 

AAACTGGTAG TAGCTGGGGA TGGCCCTTAT CTGAATGACC TCAAAGAGCA AGCCCAGAAC 1980 

CTAGAGATTC AAGACTCAGT CATCTTTACA GGGATGATTG CTCCTAGTGA GACGGCTCTT 2040 

TACTATAAAG CGGCGGATTT CTTCATTTCG GCATCGACAA GCGAAACGCA AGGTTTGACC 2100 

TACTTGGAAA GCTTAGCCAG TGGAACACCT GTCATTGCTC ACGGAAATCC TTATTTGAAC 2160 

AACCTCATCA GTGATAAAAT GTTTGGAACC TTGTACTATG GAGAACATGA TTTGGCTGGT 2220 

GCTATTTTGG AAGCCCTGAT TGCAACACCA GACATGAACG AGCATACCTT ATCAGAGAAA 2280 

TTGTATGAGA TTTCAGCTGA GAACTTTGGG AAACGAGTGC ATGAGTTTTA TCTGGATGCC 2340 

ATTATTTCAA ATAACTTCCA GAAAGATTTG GCTAAAGATG ATACGGTCAG TCAGCGTATC 2400 

TTTAAGACAG TTTTGTATCT TCAGCAACAG GTGGTTGCTG TACCTGTAAA AGGATCTAGA 2460 

CGCATGTTGA AGGCTTCAAA AACACAGTTG ATCAGTATGA GAGACTATTG GAAAGACCAT 2520 

GAAGAATAGA AAGAGGAACA GCTATGAAAA AAACAATTAA TGAGAAGCGG TCGTGATAAA 2580 

AAGATTGCGG GTGTTTGTGC TGGGGTGGCC CATTATCTGG ATATGGATCC GACTATCGTT 2640 

CAAGTCATTT GGGGTGTTCT TACTTGCTGT TACGGAGCTG GAATTGTAGC TTACATTATT 2700 

TTATGGATTA TCGCGA 2716 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13926 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CTTTGGTTTT GCCTTATTCA AGACATGAGG GCCATCAGGA ATGATCTGAA ACTGCGAATC 60 



WO 98/18931 



PCT/US97/19588 






TGTTAACAGT CTATGGAGAG CTTTCATAGA ACTAAGATTC GGTTTATCTT TGCTGCCACA 120

AATTAGTAAG GTTGGATAAG GGTAAGTTCC TGCTATATCC GTTAAATCAA GTGTCTTCAA 180

CTCCTCAGAA ACTCCGACCA TAAGAGTCTT GTCTGCTCCC TGTTTTTCAA ATACTCTTTT 240

GGGAAGTAGT TTAAAAATCA GCAATTGAAG ATAAAATAGG ATATTCCCTG CTAATTTAAG 300

CGGGCATCCT GACAGAATCA AAGCTCGAAG ATTTGGTAAA TCGTAACTGG AAAGTTCTAG 360

TGTCAGGGCA GCACCTAAGG ACAATCCAAT CAAAACAAAA GGTTCTGTCT CTTGAGCTAG 420

GTGCTGATAA ACTCGCTCTT TAGCTTGTTG ATAGTTACTA ACTCCAGAAG GAAATAACTC 480


C AT AC1CCTC A 


unnuunlAnl 


CTGTCAGTAG 


ATTCCGAACT 


TCTTTCCAAG 


ACTCTGCTGA 


540 


CTGCPPTAAr 1 

V. A. X ML 




ATATTAATTT 


CATCTAGTTC 


TCCTCAAGGC 


TTAATTCATA 


600 


pa arz^ , r , T^ ,^ T>^ , 

WinuUL 1 L 1\, 


At. I\»l A\L TAC 


AGCCGTAAAT 


AGCTTCTGCT 


TGGGTTAAAT 


CTGCCAAGGT 


660 


CAAGACTTTC 


i V. i XLl ALL 1 


GTCCTGTTTC 


TAGCAAATGC 


TGACGGTAAA 


TTCCTGGCAA 


720 


GATTCCAAGT 




GTGTGTAGAG 


TTTTCCAGCG 


ATTTTCAGAA 


CCAAATTTCC 


780 


TATAGAGGTT 




CTCCTGACTT 


ATTGTGGTAA 


ATCTTCTCTT 


GTTCTCCTAG 


840 


GCTCAAATGC GGTCGGTGAG TGGTTTTAAA GTAGGTAAAG GATTGATTCA AAGCAGCTTC 900

CTGAAGACAG ACTTGGGCCT GACAAAAGCT TGTACTGAGA GGGGTTAATA CTTGACGATT 960

GACTTCTATC TCTCCAGATT TGCTAAGGCT GATTCGCAAG CGGTAATCTC GATTAGCTTC 1020

ACAATCCTGA CACTCTTCCT CAATCTTGTG TCCCAAGTCT TCTGCATCAA AAGGAAAAGC 1080

AAAATAACGA CTAGCTTTTC TCAGCCTTTC CAGATGTTGT TCTTCAAACA TCAGTTGTTT 1140

TTGGCTGATT TTTCCAGTTG TAATTAATTG GAAGCGAGCT TGTTTACGAT AGAGAACTGC 1200

TGCCTTTTGA TGAACCTCTC GGTATTCAGA TTCCCATGTG CTATCCCAAG TAATCCCTCC 1260

GCCAACTCCA TAAATGGCTT GACCTTTGTG AAGTTGAATG GTACGAATGG CCACATTAAA 1320

AATCCGTCGT CCATTTGGAA GCAAGAGACC AATCGTTCCA CAGTAGACTC CACGCGGTTG 1380

AGGCTCCAAG TCCTTGATAA TCTCCATTGT CGCAATTTTC GGTGCACCCG TTATGGAACC 1440

ACAAGGAAAG AGTGAGCGGA AGATTTCAAC AAGGTCCACA TCCTCTCGCA ACTGACTCTT 1500

GATGGTCGAA GTCATCTGCC AAACAGTTGA ATACTGCTCT ACCTGACACA GACGCTCCAC 1560

GTGCTCGCTC CCAACTTCAG AAATACGGTT CATATCATTG CGCAAGAGGT CCACAATCAT 1620

CATATTTTCA GAGCGATTTT TGGGATCCTG TTCCAACCAA CTGGCCTGTT CAAGATCTTC 1680

TTGGTCAGTT ACCCCACGCT GAGTCGTCCC CTTCATTGGT CGTGTTGTCA ACTCGCGATC 1740

ATTTTGCTCA AAAAAGAGCT CTGGGCTCAT GGAAATCACT GTCATCTCGT CATGTTCCAC 1800
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ATAGGCATTG 



TAGCCCGCCT CCTGCTCTAC CACCATACGA TTGTAGATGG CAAAAGGATT 


1860 


GGCATTTAAC 


TTTTGCTTAA GTTGGACGGT GTAGTTGACC TGATAGGTAT CTCCCTGCCG 


1920 


TAAATGATGG 


TGAATTTGGG CAATGGCCTT TTCATAGTCT GCTGCAGACG TTACTTCCTG 


1980 


CCAATTTGAG 


GGCAAATCAA TATCCTCATA AGTCAGAGGA ATAGGGGAAG TTTCTACGAT 


2040 


ATCATGAACA 


GTAAAGTAAA GCAGGTACTC TCCCAGTAGG GGATCCTTGT GAACTGCTAA 


2100 


TTTTTCCTCA 


AAAGCAGGTG CAGCCTCGTA GCTGACATAC CCCACCACAT AATAACCTTG 


2160 


CTCTTGGTAG 


CTTTCCACTT GTGCCAGCAA ATCTGCCACT TCTTCTACAT TTCTCGTTTT 


2220 


CAACTCTTTA 


ATAGGCTGGG TAAAGGTATA TCTCTCCCCC AAAGTCCTAA AATCAATCAC 


2280 


TGTTTTTCTA 


TGCATACCTT AAGTATAGCA TAAAATAAGA AAACCCTCAT CCGCAAAGCA 


2340 


GATGAGAGAT 


TTCAATTATT TAAAGATTGA AGTTTTAAAG CTATTTGTTT GTTGAAGAAG 


2400 


TTTCTTATAA ACAGCTTCTT TTAATTTAAC TGTATTATTC ATAGATACTG TTTTATTACC 


2460 


GTTTGCTTCT 


TGTTTAAGAG TTTCGGCATC TTTTTTAACA GCTTCTTTAA ACAATGTCAG 


2520 


TAAATCATCG 


TATGATGAAA CGGAAGAACC ATTTACTTCG AATGTTGTTA ATCCTTTCGT 


2580 


TGCTTTATCT 


TTAACTTCTT TGAAGTAAGC TTTTTTAAAT TCTTCAATAG TATTAAATGT 


2640 


ATTGTTAGAT 


ATTTTCTTGA TAATATATTC ATCACTTAGA ACAGACTCAC CATCTGTTTT 


2700 


AGATTGTTGT 


TTATATTTAT TTGAAGCATA ACCTAAGAAC CCATTTTCGT ATCCGTAGTA 


2760 


ACCCCATAAT 


CTAAAAGCAT TATGTTTGAA TGAAACAGCT CCAGGAGCAC CTTTACTAGT 


2820 


ATTACCTCCG 


TAGATACCGG TCATCATTCT AACACCTACA TAAGGTGATT GATCGTTATA 


2880 


GCTAATTGCT 


TCGGGTTTAT AGATACCATT ACCTGGATTG CGATTAGTCA TTAATTGTTG 


2940 


ATCAACTAAA 


TCATTAACAG ATTGAATATT TAATTCATTT TTCTCTTCTT GACTTAGATT 


3000 


TCGAATTTTA 


TCCCATTGAT TTAATTTATT GTTATCACGG TATTCTCTAT CTATTTTTTT 


3060 


GAACCATGCA 


CTATTTAAAT CTTTATTTTG TTGAGAAATC ACAGATTCAG CCTCAATTTC 


3120 


ATCAAGAAGA 


GTTAAAGTGT CATTATAACC CTTCATATAT CTATTAATAT CTTCTCGTGT 


3180 


TTTTAGAGTT 


TTTGGATCTG TAATATACCA CTGATTCCCA TCATTTTTGC GTTTAAATAC 


3240 


CATATTAATA 


CCTAAAGAAC CAAACTCATC AAATCCACTA CCAGTAACAG GAGTTTGTAG 


3300 


CATACCCTGA 


GCATATGCTT CAGCATCAGT ACCTTCACGG TGTCCAAAGC CACCTAAGTA 


3360 


AATCGCACGG 


TCGTTGACGT GTGTTGTTTC ATGTGTGTAA ACTGAAATAC CGTATTCACC 


3420 


AACCATTTCT 


AAATGAACAT ATTTTACATC AGTTCTAATA TCATCAGAGT TAGGATATAT 


3480 


AGCAGCATAA 


GCTCCTGTTC CATTATAATT ATAATACTTA TCCATAGGAC CAAAGAATTC 


3540 


TCTAAGAGGA 


GTATATACTT TGTCGGTATT ATAGCGGCCA TATTTTTCAA CCCATCCACC 


3600 
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AGGAGCGTTA TAiPPfprrw* &b aTurrisn a ?\ r» * r*r> * m/tm 

nwjnu^o nn J. nnLL 1 1 AAA 1 AbOAA 1 AAL, ALiC. A 1 CJ T 


CTTAGTAGTC 


GTTGTTTAAC 


3660 


v ** m*v*nu«V. vjV- 1 V"Hj/iv_VxA 1 AV-L,AL»AAA 1\. A 1 AA LfVJTrl L 


CTATAACCAT 


CTGCAGCTTT 


3720 




AATCGCTCTG 


CACTACCAAA 


3780 


w\a wvn a ivjvn l ittl AA lllu AAA I i AAA 1 A AAGA TGTGCT 


TTATCAATAT 


TCAGTAGTGG 


3840 


vm«j 1 « 1 *vj * a i 1 1L1 AAGG i GAUTTCGTTT TAAATTATCG 


AATGCACGAT 


GTTTAGAATT 


3900 


iiiftftiiiLJ iuGACt.TCAG AAGCGCGTTC TGCGATGTAG 


ACATGGTCTT 


CTGTAGCATC 


3960 


AATAAACCAA TCGTTCATAT TGTCTATATT TGTGAACAAT 


TGTCTATTAT AATTTAAAAA 


4020 


luLniLi AAA 1 1 At, v. I\aA II TAGTATATTT AGCC AAT ACT 


TV* 1 A /"</""»/"■* * a mo 

1\>AL (_GAATG 


CGTCGAATGT 


4080 


ACGTGAACCT TTAATGTTGT TCTCTTTAGA ACCGATTTCA 


Al i AA1LIG T 


CTAATACGCT 


4140 


AACTTTTTCA CCATAGAAAT CTGGTTTGAA TAGCATTAAT 


ILl 1 iAAIAT 


TAACATCACC 


4200 


AAATTTAACT CCATAGTAAC GATTTAGGTA AGTTAAACCT 


. AVj 1 AA X AAAu 


L I\»CTTTGTT 


4260 


TTTCTCGACT TTATCACGAA TCATTTGACG AGCAGCTGGA 


GAATCATTTA 


GTTGATGTTC 


4320 


TTCGTTTTGA ACTAATTTTG TGATTAGGTT TGTTAAGTTT 


TCTTTAACAT 


CTGTGAAGCT 


4380 


TTCTTCTAAA TATAAATCTT TGATTGCATT AACTCTATAG 


TCACCTAATC 


GATTTAGATG 


4440 


L.HaAiAt,ATC GTTTGAGACT GAAGCTCTAC TGATTCTAAA 


ATAGATTTTA 


TATCATTAAC 


4500 


AAGAGTAGTG TTATCTTTTT GAACGATATT AGGTGTATAT 


TTAATTCCTA 


AGTCAGTTAT 


4560 


A f2 r P 2i A T^'Pr^P 'I unfit* /^%ntrr<n /"> mm»)tk/«/wninn n/invininxAt\ 

iftiAiiLj ill At, ATT AC TTAAACCTTC ACTGCT AG AA 


GACAAGTTAA 


AGTAATCTTT 


4620 


* * nv-v_o i v_\„ W,AlAt5i\jAA L. AA I AA1 11 J ATTAGCTTCA 


TCTAGGTTTG 


TGATAAACTC 


4680 




TTTAGATGGT 


GTTCTTTATT


4740 


****** V.Lilui\lftlA LivilniAAlv IIIAIIGIAG 


AATGGTATTA 


ATTTTTCAAG 


4800 


ATTTTTATAG GCTTGGTTAT ATTCAGCGTT ATAATCTTGA 


ATACTAGAAT 


AGGCTTTTTC 


4860 


TTCATTAAGT TTTGCAAGAG GAGATAGATC ACTTTCTAAT 


TTATCAGCAG 


TAATATTGAA 


4920 


AGTAGTAACT TT AG CATC AG CTTGTTCTTT AGTTAATTTA 


GTAAATGTTT 


TAGATTTCCT 


4980 


AAATGATCTA TTACCTGACG AATATCCCTC TACCGCATAT 


AAATCTTTTA 


TATGAGCACT 


5040 


AGCATAATCA GAATCATCAA CGTCGTTAGA GCCGAATAAC 


TCCTCTCCAC 


GGATAATCTT 


5100 


AGCATAGCTG ACAGAATTAC TTACCGTACC TACAGGCCAA 


GTCTTACTTG 


CTATTGCTCC


5160 


AACTTCTACT GGATTTGAAA CATCTATTTT ACCTTTTACA 


ACCGACTCAG TTAGGAGAGC 


5220 


TTTTGTACCA ATAAGATGGT CTAGAGTTAA TCCATAATCT 


ACTTTAGGAA 


CTAACAAGCT 


5280 


GGCGCGTGTT TTGTTTCCTG TAATAGTAGC ATCAACATAT 


GCTTTTCTAA 


CAATTCCTCT 


5340 
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ATAGTTTGTA CCTGCAATTC CCCCTGTATG AGAGCCATTT CCACTTGTAG 


AGTGTAGTTT 


5400 


GCCAAAGAAA GCAACATTTT CAATACGAGT TCCATCATTC ATATTATTTA 


CAAATCCAGC 


5460 


AACATTATTA CGACCTGAAA GTGTGCCTGT AATTTTGACA TTTGTAATAA 


CTGAAGAACC 


5520 


TTTCATAGTA TTGGCTAATG ATGCAATATT ATCTTGACCA GAACGTTCTA 


TCTCTACATT 


5580 


TTCAAAATTC ACATTATTTA TCGTTGCGTT TGTTATCACA TTAAATAATG 


GATGTTCCAA 


5640 


TTCAGTAATA GCAAATTGTT TTCCTTCAGA ACTTAAAAGT TTTCCTGTGA 


ATTCTTTAGT 


5700 


GATATATGAT TTTCCATTAG GAACAACATT TCTAGCGCTC ATTGATTGTC 


CCAGACGATA 


5760 


TTCTTTTGAA GGATCGTTTT GAATAGCTTC CACTAATTCT TTGAAATTAT 


AATATACATT 


5820 


ATCTTCGTGG ACTTTAGGTT TTTCAATATA GTGAACGTAT TCTTCTTCAA 


ATTTATTATC 


5880 


AGCAGTTCTA GAGACTAAAT TGTCTGCGAT TGCTGTAACT TTATATACAG 


GTGTTCCGTT 


5940 


AACCGTAGTT TCTTCTATAT TTTTAACAGC TAGTAATGTA GTTTTCTGAT 


TATTTGAAGT 


6000 


TATTTTTAAA TAATAATTGC TCTTATCATC AGGAATAGTT GTTATCAGTG 


ATTCATTAGT 


6060 


TTCTTTTCCA TTTTCGTATT TGATTAAATC TGTACGTTTA ATATTTTTAA 


GCTCAACTTT 


6120 


TTTAAGATCT AATTGAATAT TTTGATTTTC TAGAGTTTCA GTTTCTTCAC 


CGTTACCTCT 


6180 


GTCGTAAATC ATAGTTGTAG ATAGGGTGTA TTCTTTGTAG TACTCTAGGT 


TCTTAAATGC 


6240 


AGCGCTTATA GTTTCTGTTG TTACCTTGTC ATCTGTAAGG ACTACAGTAT 


TAATAACTTC 


6300 


TTCTCCTTTT TTCAATTCAG CTGTGATTGA TTTGATTTTT GTTTTGTTTT 


GATTTTCTAG 


6360 


AGTATACTTA GCAACAGCTT CACGTTCCAA TATTTTCTTA TCGGTACTAG 


TCAATGTTAA 


6420 


TATTGGCTTT TCAGATAATT CAACCAATTT TTCAATAGTT GCAGTTAATT 


TTTCAACAGC 


6480 


TTCGTTAACT TCACTTTGTT TAGCATCTGT ATTAGCTGCA ACTTTTTCAG 


CCTTTGTAAC 


6540 


TTCAGTTTGG AGGTTTTGCC AACTTCTATC ACTGTAATGT TCTTTTACCT 


TTGTTTTTGC 


6600 


ATCTGCAATC GTATTGTTTA ATTCAGTTTT ATCAACGTTT AGAGCGTCAA 


TAGCCGTTTT 


6660 


AAGTTTATTT GTCTCGCTAT TTACCTCAGG CTGTTTTACA GGCTCTGAAG 


CATAGACACC 


6720 


TTTTGCAGTT TCTAAAACAG GTCCAAGAGC ATTGTAACTT GCTGTAGAAT 


AATCAGTAGG 


6780 


AGAAACTGAA CTAGCTTTAT CAATTTGATT ATTTAACTCA CTTTTATCAA 


CTGGTTCTTT 


6840 


AGTACCAATA CCCTTTATTT TATCTTCTGG TTTCGGTGTT TCCTCTACAG 


CCTTCTCTTC 


6900 


TTCAGGAACT TCTGGTTGCT TTTCTGGCTC AACTGGTGCC GTTGGTGCCT 


GTTCGTCTTC 


6960 


TCTTGGCGCG ACTGGTTCAC CTGCTTGTTC AACTTTTGGT TCCTCTGTTG 


GTTCTGTTTG 


7020 


TTTTTCTACA GCAGGCGTTT CAACTTTTGG TTGTTCAATA GATTGATTAA 


CAGTCTCCTC 


7080 


TTTTGGTTCT ACAGTTTCTT CAGCCTTGGT ATCTGGAGTT GACTCTTCTT GTTTCGGTGT 


7140 
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******* GCC^TCT, CT.CAGGAGC TTCTGGTTGC T^c^ 

C^CCTCT TCTCTTGGCG CGACTGGTTC ACCTGC™ 
"" mB,C ' GATGGTTGAC ™ CTCGCT r AAC^CTACT ^ 
AACT^CA CCTACTTCTT CAACTGGAGC TGG^ GAATCTTCTT TCCCC^ 
TACTTTAGGA AGGGTGTCGT CAGTAGGTTT TACCTCCCAT CCTTTGGACT 
TTCTTCTGTT TTAGGTGCTT C«m»0 AGC^CTCT GTCTCTACTA CT^^ 
TGTCCTAGCT 

CTCTCCAGGT ^ 
GTCACCTGAT AGATAACCAA CATAGCGATA GCCC^CATT TCAACAACAC CCTCTCGACT 
AGCCAGCGCT AGGGTCGCAA CTGGGTCTAC AGCCCC^CA CTAGGAAGAA CTACCAATCC 
WMCWA ACTAGAAAGA CGCTAGCAAT TTGTAGATTA AAAGCAAGCT 

CCCAACAGXC AGCAAACCAA AAGCTGTCAA AACAGATGCT TCTGTCCCTG TTTGAGGCAA 
CTGATCTTTT ^ACACCA AACCATATAC AAC^C CTGTCAGGCT TTCCTGTCIX3 
AATTAAATCT TTAGCTTCTT GTGAAATAAT CTCTTTATTT ACATAGTGAT AGGTGGCTGC 
GTCCACTACA GAAGGAGCCA TCAAAAGGCT TCCAAGAAAT ACAGAGCCTA CAAC T C CCTr 
AATCTTACGA ATTGAAAAAC GGTCTTTTTT AAACACTTTT ATCTCCTTTA TTCA^TCA 
AAACTTCCTA ATAGCA TCTT GCGGATAGTG CGCACGCGCA CCTCCGATXA A TOTOGG ACC 
ACTAGCCAGT GCCGTTACAT GGGCATGACC AATCTCTCTC AAAATAGGGC GAATCGGAAC 
CTGAACATGC TTGACATGCA TGCCAATTGC AGTGTCTCCG ATATCCAATC CAGCATGAGC 
CTTGATAAAT TCAACCTCAA CTGGATCCTG CATAAACTTA AAGGCTGCCA AC TC CCCCGA 
ACCTCCTGCA TGAAGAGTAG GATGGACACT GACAATTTCC AGACCAAACT GCTCTGCCAC 
CTGACGTTCA ACAACGAGAG CCCGATTGAC ATGCTCACAA CC^AAC. CTAAATGGAT 
ACC^ACA CCTAGAATAT CCAAGATAGT C^ACXATC AGCTCACCAA TCTCTTGACT 
" Bn " C CCAATATGAC CACCTAGCAC CTCACTAGAA GATAGACCTA AAACAAAAAG 
CCCCCCCTGC TTCAAATTGG TCTTTTCTAA AACATCTTCC ACTACCTGAC GTGTTTCTCT 
TTGAATCTGT GT CTCGTTCA TCTCTGTTAC CTCT CTTCTC actcttctat cataccgttt 
rrrc^r ™ gcaagat agacaaccta gaaagtttgc ccaa^acgc ATAAAACTCC 

CAGAATTGAC TGGGAGTTAG C TAGITTCT a ATATATTTCA AC^cc 

c^gg TCTagaatca atcttcatat tccaamtca agtttgagcc 



7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 
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GTTGATCGAC ATTTTGAAGA CCAACTCCCC CACGTTTGAG TTGACTTTGA CTACTATCAC 8940 

CAGCATCTTG GAAGCCAACG CCATCATCCT CAATACGGAT GACCAATCCC GAATCCTGTT 9000 

TCTGGACAGA AAGTTTAATA TGGCCCTGAC CTTCCTTTTC CTTAATGCCA TGGTAAAGAG 9060 

CATTTTCTAC AAGGGGTTGT AGGACCAGCT TGGGTAAGAC TAAATTATCA AAGGCAACAT 9120 

TTTCATTAAT TTCGTATTCC AGCTTATCTC CATAGCGTTG TTTCTGGATA AAGAGATACT 9180 

GGCGGACATG ATTGATTTCG TCAGAGAGAC AAATCAAGTC CTTGCCTTGA TTGAGCGCCA 9240 

AGCGGAAATA GGTTGCCAAG GACTTGGTCA CCTGCACCAC TCGCTGACTA TCATGAAATT 9300 

CAGCCATCCA GATGATGGTG TCCAAAGTGT TATAGAGGAA ATGTGGATTA ATCTGGCTCG 9360 

AAAGGGCTTG AAGTTGGTAC TGACGGGTCG TTTCTTCCTG GCTACGAATA GCTACCATCA 9420 

ACTGATCAAT CTGATCCAAC ATAGCATTAA ATTGGCGAGT TACTTCTCTC AGTTCATAGG 9480 

CACCAACTTC CTTGGCACGA AGATTTTGAG CACCAGAAGC AATTTCCAAC ATGGTTTCTC 9540 

TCAAATCCTT CAAAGGAGCA ATCCAGCGTT TAAGACTGAA CCACACTAAG CAGAGACAGA 9600 

CAAGAAGAGA TGTGACACTG GCCCCAAGCA AGGTCCACAA GAGCTGACTC CGAACCTGGT 9660 

CTAACTTTTC CAATGATGAC ACGCCAAGCA CCGTCCAATC AGTTCCTGCA ATCTTCTCTT 9720 

GACTGACGTA GGATTTGTGA CCAGGAGTAT AACCCTGACC TGTATCGATG TAGGGTTTCA 9780 

TAGCCTCCAT TTTGCTAGAC GAACTATAAA CTGTGTGTTG AGGATGGTAG ACAAATTCAT 9840 

GGTTTTCATT GATAATGAAG GCAAAGCCCT GCTGCCCCAA CTGGAGTTGA TTGAGATAGG 9900 

CTTCCAGAGT TTCATAAGAA ATATCCAAAC GAAGCACACC AAGATTGGCT CCCTTTGCAT 9960 

CAACAAGTTC TTGAGTGACA GAAATGACCC ACTGACTATC TGATTTACGA GCTGGAGTCA 10020 

AAACAGGCAT AGCTCCCTGA TGAATGGCCT TTTGGTACCA ATCCTCAGCC ATCATATCAG 10080 

AGGAAGTTTT CATCTGCACA CTGTCATCTG TAGAAATGAC CTGACCAGAT TTGGTCACCA 10140

GCACAACAGT TTTCAAGTCC TTATCTGACT TCAAGATGGT CAAAAACAAA TCTCGGATTC 10200 

CCTCGACCTT GTCTTGACTG GGATTCTCAG CATAGGCCAG AACATCCGTC TGCTGGGTCA 10260 

AACCAGTCGA GGTGGTTTCT AGTTTTTTGA TATAAGACTG AATAAAGTGG CTAGTCTGGC 10320 

TGATGGTCGT TTGGCTGTTG CCCTCAATGG TGGCCTCAAT GGCTGAAGAA CTTGATTGAT 10380

AGTAGAAAGT TCCAACCAGA GCTAGGAGAA TGAGAAAGAC CAGAAAGATG GAAATAACCA 10440 

TTCTAACTAA AAGAGAAGAA CGCTTCATCG GTCTTCTCCC TTCTTAAACT GACGAGGTGT 10500 

CACACCTGCA ATCTGCTTAA AACGTTGGGT AAAATAGTTC ATATCTTCAA AACCAACCTT 10560 

CTCTGCGATC TCATAAATCT TCAGATCTGT AGTTAAAAGC AAGAGCTTGG CTTGTTTAAC 10620 

ACGTTCTCTC ACCAGATAAT CCTGAAAAGG CAAGCCCAAC TCTTTCTTAA TCAAGGAACT 10680 
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CAGATAGGTC 
AGCCAGATGA 
TTGTAACTGC 
CTCAATATCC 
AGACAAGGCA 
GGTTTCTCGT 
TAAAATGATA 
CTGACCGATG 
TACCAGATAT 
TTACTAGTAT 
AATAAAAATC 
GGTTGTAAAT 
TGACGAAGTC 
CGAAGAGTAT 
TTGGGAATAA 
GGACCATCCG 
CGCGTCATAT 
GTAAAACTAG 
CCAGTTGCTA 
GCTCGTTCTG 
AATTCCTCAT 
TTAACATTGA 
TCAGCCACCA 
TTAGCCACCT 
ACACCAGTAC 
ATAATGCGGA 
ACATGGACGG 
CTACCATTTG 
TCCACTCCCC 



GGACTAAAAC 
GACTGGATTT 
TCTTCTTTCT 
TGACGAGAAA 
TAATCAAAAT 
ACCAGACTGG 
TCTGGCACCT 
ATTTCCATAT 
TCATCTTCTA 
CAGTATAGCA 
AAAAAGTAAA 
AAAACTGACG 
GATAACCCTA 
TAATCAACAT 
AGCGGATAGA 
TAAAGACATG 
TGTAGGACTT 
GCCAGCCACA 
TATCCACATA 
TTTGATTTTC 
CACTTGGTTT 
TATGGCAGTA 
CAAAATTCTT 
CATCAAAGAC 
GGTACTGGGT 
AATAGTGAAG 
TTTCTGCATG 
CATAGCCTGA 
AGAAACAACC 



CTAAGTCACT 
TCTGGGCCAT 
CTTCCTTGTC 
AGGGTTTGAG 
CATCGTAACC 
CCAACTGGAT 
GCTTTTGGAT 
CGTAGGCTGC 
CGATTAAGAT 
AAATTCTCCT 
CTAGGAAGAT 
AAGTCGACTC 
CATACGGTAA 
AATCTAGTAA 
GAGGCTATTG 
CCCAAGGTGA 
ATCTTCCTTG 
ACCAGACTCA 
GATACCGGAT 
CTGGGTAACT 
TGGATATTTG 
GCCATTTGGA 
CAAGTTTTCC 
TTGGTTAATC 
CCCCACATCA 
CAGGATTTCC 
ACCTGTTTGG 
AACGGCATCC 
TCCAGCTAGA 



GGCTAAAGAC 
GTTTCCTTCA 
TAGTTTTTGT 
CAGGTAGTCG 
TGTTAAAAAG 
GCCATTTAGA 
CAATTCCCAA 
TACATTGACC 
TGTGTAGGTC 
CTAACTGCTT 
AGCCACAGGT 
AAAGTATAGC 
GGCGACGCTG 
ATAAGCGTAC
ATACAGTAAC 
GAATCTCCTA 
TAGGTGACAA 
AATTTGTCTT 
TCAAATTTAT 
GCATACTCCT 
CTGGCATCAA 
TTTTTCTTGA 
TTTTCAACTG 
ACTTCCAAAT 
TTTCCTTGTT 
TTGAGAGAAA 
TTAATCAATT 
GTCACCCCGG 
TAAATTTCGT 



TTTAAACTAA 
AACCTATTAG 
TTGATTTTCC 
TCCACACCTA 
ACCAAATGAA 
TGAGGCATGT 
GCCTGCCTTC 
AGTTTAGTCA 
ATGCTCTGCT 
AGGAAAGACC 
TTCTCAAAGT 
TTTGAGGTTG 
ACGTGGTTTG 
CTTTTTCTTC 
GTAAGCCGCC 
CTCGGCTCCG 
CATCTGGACT 
TTGATGAAAA 
CCCAGTAACG 
CAGGTGACAG 
TGACAGGATA 
GATAGTCTTG 
CTAGAGGTTG 
CCTTGTCATC 
TATTTTTGCT 
TTTGCTTGGC 
CGTACTTGGT 
GAACACGTGA 
GCAAGTCTGC 



ATTGGCTATC 
TCAATAAATC 
CCAACATTTC 
GTTTGACAGC 
CCTGAGGATA 
TGATATCGGT 
CATTTTCAGC 
AACCTTGTCT 
CCTTTACCAC 
TCTTATACTC 
ACCGCTTTGA 
TAGATAAAAC 
AAGAGATTTT 
CATTTGGTCT 
CTTGTCCTGT 
CACTTCCATA 
GATGGGTTGG 
GAGAGGTTCC 
GTTTGAGAAA 
GGTCTTTTTC 
GGCCGCCTGA 
ATGGTAATCC 
ATCGTATTTC 
TGTGTAATAA 
GGTTGGATTG 
A7CATAGGTG
TGTTTCTCCT 
GAAATATTCC 
GTCTTTACTA 



10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120 

12180 

12240 

12300 

12360 

12420 
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ATTTCTGTTT 


TTTTCACTGC 


TTTTCCTCCT 


TGGCTAACTG 


CCGCCTTTTC 


AATTTGCGAG 


12480 


GCATCTGTCT 


GCCCTGCATT 


TCGTATCAAT 


AGAACATAGA 


AACCGGTTAT 


GGCTAGAAAA 


12540 


AATACTCCTA 


GCAACAAGAA 


GATTTTTAAC 


TTATCATTCA 


TAAGACGCCT 


CCTAGGCTAA 


12600 


TTCCTTCAAA 


GTTTGCAAAA 


TTGCATCTTT 


TTCCATGAAT 


CCTGGATGTG 


TTTTGACCAG 


12660 


CTTGCCTTCT 


TTGTCTATAA 


AGGCTTGGGT 


TGGGTAAGAA 


CGGACACCAT 


AAGTTTCCAA 


12720 


AAGTTTGCCT 


GATGGGTCAA 


CTAGGACTGG 


GAGATTTTTA 


TAATCCAATC 


CCTTATACCA 


12780 


ATTCTTAAAG 


TCCGCTTCAG 


ATTGCTCTCC 


CTTATGTCCT 


GGTGACACTA 


CTGTCAAGAC 


12840 


CACATAGTCA 


TCACCAGCTT 


CTTTAGCAAT 


CTCATCCGTA 


TCTGGAAGAC 


TAGCCAGACA 


12900 


GATGGAACAC 


CAAGAAGCCC 


AGAATTTGAG 


ATAGACTTTC 


TTGCCCTTGT 


AATCAGATAA 


12960 


ACGGTAGGTC 


TTGCCATCTA 


CTCCCATCAA 


TTCAAAATCA 


GCCACCTCTT 


TCCCTTTAGC 


13020 


TGCGCTTGTT 


TTACTAGCTG 


TCTGCTCCGT 


CTTCATTTCA 


TCTTTCGTTT 


GGTGTTCACT 


13080 


AGTCACGGAC 


TTGCCTGAAC 


AAGCCGTCAA 


ACAAAGGAGC 


GAACCTGCTC 


CAAGAACACA 


13140 


TGTTTGCCAT 


TTTTTCATAT 


TGATATTCCT 


TTCCATTTTA 


TTCAAATAAT 


TGACTTAAAA 


13200 


TTGAAGCATT 


TCCAAACAGA 


ACCAAGAAGC 


CCATCACAAT 


AATGAGAAAA 


CCACCCACTT 


13260 


TTTTGAGGAT 


TCCGAGATAG 


GGATGAAGTT 


TTCGGAAATG 


TTTCAAAACA 


TAACTAGAGG 


13320 


TCAGAGCTAG 


AAGCAAGAAT 


GGTAGCGCCA AGCCCAGCGT ATACACCAAC 


ATGAGACCAG 


13380 


CTCCCTGCCA 


AGCTCCTGAA 


CCACCTGAAG 


CCGCCAAGGC 


CAAAACAGAC 


CCCAGAACCG 


13440 


GCCCCACGCA 


AGGCGTCCAA 


GCAAAACTAA 


AGGTCAAGCC 


CAATAAAAAT 




i i £ fin 


AGCCCTTACC 


ATTTTGCCCC 


TGTCCTTGCA 


GTTGTAGCCT 


CTTTTCCTTA 


TAAAGCCCCT 


13560 


TAAAGTGTAG 


AATCTCCATT 


TGGTGCAAAC 


CAAGAAGGAT 


AATAATTGCC 


CCAGTAAGAT 


13620 


ATTGGAACCA 


AGAAGCATAA AGCAAATCGC 


CTAAAAAACC 


AGCTCCATAG 


CCCAACAAAA 


13680 


TAAATATAAA 


GGAAATTCCT 


GCTATAAAGG 


CCAGAGTTCG 


TAATAAACTA 


GTAACTGAGA 


13740 


TTGAAAATTT 


GCCGCTAGAA 


GCCTGAGCAC 


CATCCTTATC 


ATCTAGTAAC 


ACTCCTGTAT 


13800 


AGACCGGTAA 


CAAAGGTAAG ATACAAGGAG 


AAAAGAAGGA 


TAGAATCCCT 


GCCAAAAAGA 


13860 


CACTTAGAAA 


AAAGAAAATA 


TGACCCATAA 


AGTTCCTCCT 


ATCATTTTAT 


TGATAGATTT 


13920 


ATTATA 13926



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20199 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CCCAGCAGAA AAATGGCATT TGGAGATAAT GGAAATCGTA AAAAAACTAT GTTTGAGAAA 60

ATAACCTTGT TTATCGTGAT TATCATGCTA GTAGCAAGTT TATTGGGAAT TTTTGCAACT 120

GCAATTGGTG CCCTCAGTAA TCTATAAAAT AGATTCAAGA AAATTTAGTG ACTGGGATTT 180

CCCAGCCCTT TTTTAAAGTG AGAAGAAATA ATGAGTATGT TTTTAGATAC AGCTAAGATT 240

AAGGTCAAGG CTGGTAATGG TGGCGATGGT ATGGTTGCCT TTCGTCGTGA AAAATATGTC 300

CCTAATGGAG GCCCTTGGGG TGGTGATGGT GGTCGTGGAG GCAATGTGGT CTTCGTTGTA 360

GACGAAGGAC TACGTACCTT GATGGATTTC CGCTACAATC GTCATTTCAA GGCTGATTCT 420

GGTGAAAAAG GGATGACCAA AGGGATGCAT GGTCGTGGTG CTGAGGACCT TAGAGTTCGA 480

GTACCACAAG GTACGACTGT TCGTGATGCG GAGACTGGCA AGGTTTTAAC AGATTTGATT 540

GAACATGGGC AAGAATTTAT CGTTGCCCAC GGTGGTCGTG GTGGACGTGG AAATATTCGT 600

TTCGCGACAC CAAAAAATCC TGCACCGGAA ATCTCTGAAA ATGGAGAACC AGGTCAGGAA 660

CGTGAGTTAC AATTGGAACT AAAAATCTTG GCAGATGTCG GTTTAGTAGG ATTCCCATCT 720

GTAGGGAAGT CAACACTTTT AAGTGTTATT ACCTCAGCTA AGCCTAAAAT TGGTGCCTAC 780

CACTTTACCA CTATTGTACC AAATTTAGGT ATGGTTCGCA CCCAATCAGG TGAATCCTTT 840

GCAGTAGCCG ACTTGCCAGG TTTGATTGAA GGGGCTAGTC AAGGTGTTGG TTTGGGAACT 900

CAGTTCCTCC GTCACATCGA GCGTACACGT GTTATCCTTC ACATCATTGA TATGTCAGCT 960

AGCGAGGGCC GTGATCCATA TGAGGACTAC CTAGCTATCA ATAAAGAGCT GGAGTCTTAC 1020

AATCTTCGCC TCATGGAGCG TCCACAGATT ATTGTAGCTA ATAAGATGGA CATGCCTGAG 1080

AGTCAGGAAA ATCTTGAAGA CTTTAAGAAA AAATTGGCTG AAAATTATGA TGAATTTGAA 1140

GAGTTACCAG CTATCTTCCC AATTTCTGGA TTGACCAAGC AAGGTCTGGC AACACTTTTA 1200

GATGCTACAG CTGAATTGTT AGACAAGACA CCAGAATTTT TGCTCTACGA CGAGTCCGAT 1260

ATGGAAGAAG AAGCTTACTA TGGATTTGAC GAAGAAGAAA AAGCCTTTGA AATTAGTCGT 1320

GATGACGATG CGACATGGGT ACTTTCTGGT GAAAAACTCA TGAAACTCTT TAATATGACC 1380

AACTTTGATC GTGATGAATC TGTCATGAAA TTTGCCCGTC AGCTTCGTGG TATGGGGGTT 1440

GATGAAGCCC TTCGTGCGOG TGGAGCTAAA GATGGGGATT TGGTCCGCAT TGGTAAATTT 1500

GAGTTTGAAT TTGTAGACTA GGAGACTGGT ATGGGAGATA AACCGATATC TTTCCGAGAT 1560

GCGGATGGTA ATTTTGTTTC CGCCGCAGAC GTTTGGAATG AAAAGAAATT GGAAGAACTA 1620
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TTTAATCGTC 
TCTCAGTAAA 
AGGCATGTAT 
TGAAATGAGA 
AAGCAGAGAT 
CAAAATTAAA 
TCGGCGAGTC 
CTGGGGATAG 
CGAAATCGTG 
CATCAAACTC 
TTGAATTCGT 
ATACACGAGG 
CAGATAGAAG 
TTCTGTGAAC 
CCGATACTTA 
GGCAATAGCG 
CGCCTTGTCG 
GTTTTGAGCA 
AATTCAGTTA 
AGGATAAATT 
GAAAACTTAG 
CTATTAGTGG 
ATGTGGTGAC 
TGGAATTGAT 
GTGTTCAAAA 
TTTATGGGAG 
ATCTTGGTCC 
CTAGCTACGA 
GTATTTACAT 
AAGCAAATGG 



TCAATCCAAA 
GAAGCTAAAA 
AGCAAACTGA 
ACAGGACAAA 
GTACTATTCT 
TTGTTTGATT 
AAATAGCGAT 
ACCGTTTTAA 
GCTCTACGAA 
TAAAGTCCAA 
ACTAAGATTT 
AAAGATGTAC 
TGATCCTGAG 
TGAGAGAAGG 
GATAAGAGAT 
ATTCGAGAAA 
TATGTGTAGG 
ACCTGCGGCT
CTAACTCGTC 
TATGATATAC 
AATGAGAAAA 
TGCTAAAAAT 
TTTGGATTGC 
GGGAGCTACT 
TATTCCAATG 
CCTCTTAGGC 
TCGTCCGATT 
GGGAGATAAC 
GGATACGGTT 
TCGTACTATT 



TCGTGCCTTG 
AATCCCGTGC 
ATCTGGAATA 
TCGATCAGGA 
AGTTTCAATC 
CTTATTTCAA 
TCCCAAGCCT 
GTCTGACGCT 
CAGGAACGTG 
AAAGGTAGTC 
TCTATTTTCA 
GACTTATCCC 
TCACGGTTAT 
GGGAGAAGTT 
CTAGTCTTAG 
GATTATACTC 
ATACTGACTA 
AGTTTCCTAG 
AACTCTGATT 
TTTATTTTGA 
ATTGTTATCA 
AGTGTCGTTG 
GTTCCAGATA 
GTTAAGCGTT 
CCTTATGGTA 
CGTTTTGGTG 
GACTTACACC 
ATGAAGTTAT 
AGTGTGGGAG 
ATTGAAAATG 



AGATTGGCAC 



CTCATCAGAC 
GCACAGCATA 
CAGTAAAATC 
AACTATATTG 
TTTGTTATAG 
GACTATCGTG 
GGAAATAAGA 
ATAATAAGGC 
GTAACCTATA 
CTGTAACCTT 
GTGAGGTCTA 
CTGTCTGATA 
CTTGCTAAAA 
CTCCTACTCA 
TTCGAAAATC 
CGTCAGTTCC 
TTTGATCTTT 
TATCCAATAA 
AGACCTTATT 
ATGGTGGATT 
CCTTAATTCC 
TTTCGGATGT 
ATGACGATGT 
AAATTAACAG 
AAGCGACAGT 
TTAAGGCGTT 
CTGCTAAAGA 
CAACGATTAA 
CAGCCCGTGA 



GAACTAAAAA 
ACGGGATTTT 
TCTTCTAAAA 
GATTTCTAAC 
TTATAAATTG 
TATATCTGAT 
AGGTAGCGGA 
ATTGTCAGAA 
GTATATAGCG 
TGCGTAAATC 
TTAACGCCCT 
TCACTATAAA 
GGACGGTATG 
TTTAGTTGAA 
GTTTTAGGGG 
TCTTCAAATC 
ATCTACAACC 
GATTTTCATT 
AATTGAAAAG 
AGAAATCTTG 
ACCACTGCAA 
AGCTATTATC 
AGCCAGTCTT 
ATTGGAGATT 
TCTTCGTGCA 
TGGTCTACCG 
TGAAGCTATG 
TACAGGACTT 
TACGATGATT 
ACCTGAGATT 



GGAAAATCCA 
GTGGTACGAC 
TATAGTAAAA 
AATGTTTTAT 
ATTTGAATTT 
GTCAAAGTTC 
TTAAAATGGT 
GAAGGGATAG 
GATAAGAGGG 
ACGAGAGTAA 
TATATCTTGT 
GAGAAAACGA 
TATAAAACGC 
CAGCCGTATT 
ATAAAAAAGG 
ACGTCAATAT 
TCAAAACAGT 
GAGTATTAGT 
GATGGAAAAA 
AAAGAGTATT 
GGTGAAATCA 
TTGGCTGATG 
GTCGAAATCA 
GACCCAAGAG 
TCTTACTATT 
GGAGGATGTG 
GGTGCCACTG
CATGGTGCAA 
GCTGCGGTTA 
ATTGATGTAG 



1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 
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CTACTCTCTT 
TTGATGGTGT 
CTGGAACATA 
TTTACGAACA 
TATCTGAAGA 
CAGCTCCTTA 
GAGCGAATGG 
TTGAACTAGC 
GTGGACGTGA 
TAGTCATTGC 
TACGTGGTTA 
TTGAGGATTA 
TTTTTGAAAC 
GCGAGATAGC 
AAGAAAGTCA 
TTTGTGACCA 
TCAAAAAAGA 
TGACAGAGGT 
TATTTATCAT 
TTAGTTTGTT 
TTGAATTTCG 
TTGGAGGAAT 
GTTAGTGATG 
ACCCGTCCTA 
ACGTTTGCTG 
GAGAGAGAAT 
CTTCATGATG 
GAAAATGAAA 
CTGGCCAATC 



GAATAATATG 
TGAAAGATTA 
TATATCTTTA 
CCTGGAAGGG 
CAGCATTTTT 
CCCAGGCTTT 
TCGTGGTACA 
AAAGATGGAT 
TTTACGTGGG 
TGGGCTTATG 
TTCTGATATT 
AACCGTAGAG 
GGATCGCTTG 
TTCAAATCCA 
ATATGCACTG 
GAAAAATCAA 
AGCTGAGCTT 
TGTTAGAAAA 
TACCCACCTT 
CCGTCAGTTT 
GTATGTAAAA 
TACCAGTAGG 
GAACAGCCTA 
GAAGTGGAAC 
AAATTGCAGA 
TTAGTAAGTT 
GGGGGCTCTT 
ATGCAGTTCT 
AAAAAGGGAT 



GGTGCCCATA 
CATGGGACAC 
GCTGCTGCAG 
TTTATTGCTA 
GTCGAGGAAC 
GCAACTGATT 
ATTGTCGATA 
GCGGATATTT 
GCCAGTGTTA 
GCTGAAGGTA 
ATCGAAAAAT 
GTGTTTATGA 
TATTTGCGTC 
GAAAATCTTC 
GCCAATTACT 
CAAATGATTG 
GGCTATTTTT 
ATTTGTCAGC 
GAAAATAAAG 
AAGGGAAGTG 
GGAGAGTTCA 
TAAAAGGGTC 
TCGGGCTATT 
AATTCGTGTT 
AGTGACTTCT 
TTCAATTGGT 
GATTGTCGGA 
GGTTACAGGG 
TCCTGTTCTA 



TCCGTGGGGC 
GTCATCAGGT 
TTGGTAAAGG 
AGTTGGAAGA 
AGTCTAATTT 
TGCAACAACC 
CGATTTACGA 
CGACAACAAA 
AAGCGACCGA 
AAACTGAAAT 
TACGTAATTT 
ATATTTGGAC 
CTTTCTTTTT 
AATTTATTTT 
TTATGAAGTC 
GTTCTATTAA 
TGAGAAAAGA 
TTTCTTTTGA 
CTAGCCAAAG 
ATCGTTACAC 
ATGAGTAAGC 
AGTGTTCGTA 
AAAGAAGCTG 
AAATCCCAGA 
TCTGAGGTTC 
GCCATGACTG 
GACCGAACCC 
GGATTTCAGG 
AGAAGTAAGC 



AGGAACTAAT 
GATTCCAGAC 
AATTCGTATA 
AATGGGAGTG 
GAAAGCAATC 
GCTTACCCCT 
AAAACGTGTA 
TGGTCATATT 
CTTAAGAGCT 
TACCAATATC 
AGGAGCGGAT 
CAAATTAGCA 
TAGTGATAGT 
CCCAACGCAG 
CCCTTTGGGA 
ATTTGAGAAG 
TGCTTGGTCG 
GGAATTTGGC 
AGTTGCTCTT 
AAGAAAAATG 
ATCAGGAAAT 
GCATTTCGAA 
AAAACCGTGG 
AAGTTGCTAT 
TGGCTGGGCA 
AACAAAATAT 
GTATTCAGTT 
TTCATGATGA 
ATGATACCTT 



ATCATCATTA 
CGCATTGAAG 
AATAATGTTC 
AGAATGACTG 
AATATTAAGA 
CTTTTACTAA 
AATCATGTTT 
TTGTACACGG 
GGGGCTGCAC 
GAGTTTATCT 
ATTAGACTTG 
ATGTTTTCTT 
CAGGACTTCC 
GCAAGTCTGG 
GTGTGGGCAA 
TTAGATGAAA 
CAAGGATTTA 
TTAAAACAAT 
AAGTCTGGAT 
CGGGATTATC 
TCTAAGCTAT 
TCATCTAGGA 
AATTGTGGAG 
AGAGAGATTA 
AGAAGGTTTA 
CTTGTCTTAC
GCTAGCCTTG 
TGTGCTTAAA 
TACCGTCGCG 



3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

ACCATGATCA ATAAAGCCTT GTCAAATGTC CAAATCAAGA CTGATATTCT GACAGTTGAG 5220 

AAACTTTATC GCCCTAGTCA TGAGTATGGT TTTCTGAGAG AGACAGATAC AGTTAAAGAT 5280 

TATTTGGACT TGGTTCGTAA GAATCGTAGC AGCCGTTTCC CTGTTATCAA TCAACATCAG 5340 

GTCGTTGTTG GTGTTGTAAC CATGAGAGAC GCTGGTGATA AATCACCAAG CACGACAATT 5400 

GATAAGGTTA TGTCTCGTAG TCTATTTTTG GTTGGATTAT CGACAAATAT TGCCAATGTG 5460 

AGTCAACGGA TGATCGCAGA AGACTTTGAA ATGGTACCAG TTGTTCGAAG CAATCAAACT 5520 

TTGCTTGGCG TTGTGACGCG ACGAGATGTC ATGGAGAAGA TGAGCCGTTC CCAAGTTTCG 5580 

GCTCTACCAA CTTTTTCTGA GCAGATTGGA CAAAAGCTCT CTTATCACCA TGATGAAGTA 5640 

GTCATTACAG TGGAACCCTT TATGCTAGAA AAAAATGGAG TTTTGGCTAA TGGTGTATTG 5700 

GCAGAAATTC TGACCCACAT GACCCGATTP AGTTGTTAAT AGTGGTCGCA ATCTCATTAT 5760 

CGAGCAGATG CTGATCTACT TTTTGCAGGC TGTTCAGATA GATGATATAT TGCGCATTCA 5820 

GGCACGGATT ATTCATCATA CGAGACGGTC AGCTATAATT GATTACGATA TTTATCATGG 5880 

TCACCAGATT GTTTCAAAAG CAAATGTGAC TGTTAAAATT AATTAGAAAC TAGGAGAAAA 5940 

GATGATAACA TTAAAATCAG CTCGTGAAAT CGAAGCTATG GACAAGGCTG GTGATTTTCT 6000 

AGCAAGTATT CATATAGGCT TACGTGATTT GATTAAGCCA GGCGTAGATA TGTGGGAAGT 6060 

TGAAGAATAT GTCCGCCGTC GTTGTAAAGA AGAAAATTTC CTTCCACTTC AGATTGGGGT 6120 

TGACGGTGCC ATGATGGACT ATCCTTATGC TACCTGTTGC TCTCTTAACG ATGAAGTGGC 6180 

TCACGCTTTC CCTCGTCATT ATATCTTGAA AGATGGTGAT TTGCTCAAAG TTGATATGGT 6240 

TTTGGGAGGT CCCATTGCTA AATCTGACCT AAATGTCTCA AAATTAAACT TCAACAATGT 6300 

TGAACAAATG AAAAAATACA CTCAGAGCTA TTCTGGTGGT TTAGCAGACT CATGTTGGGC 6360 

TTATGCTGTT GGTACACCGT CCGAAGAAGT CAAAAACTTG ATGGATGTAA CCAAAGAAGC 6420 

TATGTACAAG GGTATTGAGC AAGCTGTTGT TGGAAATCGT ATCGGTGATA TCGGTGCGGC 6480 

TATTCAAGAA TACGCTGAAA GTCGTGGTTA CGGTGTAGTG CGTGATTTGG TTGGTCATGG 6540 

TGTTGGCCCA ACTATGCACG AAGAACCAAT GGTTCCTAAC TATGGTATTG CAGGTCGTGG 6600 

ACTCCGTCTT CGTGAAGGAA TGGTCTTAAC CATTGAACCA ATGATCAATA CAGGCGATTG 6660 

GGAAATTGAT ACAGATATGA AAACTGGTTG GGCGCATAAG ACCATTGACG GTGGATTGTC 6720 

ATGTCAGTAT GAACACCAAT TTGTCATTAC GAAAGATGGA CCTGTTATCT TGACTAGCCA 6780 

AGGTGAAGAA GGAACTTATT AATAAAAAGT GAAAAGACTA CTGGAAGTTT ATTTTGATAA 6840 

AAAATCCAGT AGATCTTTTC ATAATAAAAC GCATTGTATC AAGTGTTAGG GGCTGATATC 6900 

ATGCGTTTTT CTGCTTTTAA GATTTTTTCC AACTCTGTTT GTAAGCGCAT CATAACAAAG 6960 
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GGTCTAGGAT TCAGGGCTCT CCTCCTATAT ACTATTAGTA AAGTAAAACT AAGGGAGGAT 7020 

ATTTTAGTGT CGCAGTCTAT TGTTCCTGTA GAGATTCCAC AATATTGTCG TTTTGATTCT 7080 

AAAAAGAGAA ATGGAATTCT GTTTAATGTT CGTATTGCCA ATCTTAAATT TACTTTTTTA 7140 

TATTATACTT CCTGCGAAAC AAAATATGGT ATAGTAGTTC TATGAATGAT GAAGCAAGTA 7200 

AACAACTAAC TGATGCACGA TTTAAGCGTC TTGTTGGTGT TCAGCGTACC ACTTTTGAAG 7260 

AGATGTTAGC TGTATTAAAA ACAGCTTATC AACTTAAACA CGCAAAAGGT GGACGAAAAC 7320 

CTAAATTAAG CCTAGAAGAC CTTCTTATGC CCACTCTTCA ATAGTGCGAG AATATCGAAC 7380 

TTATGAAGAA ATTGCGGCTG ATTTTGGTAT TCACGAAAGC AACTTTATCC GTCGGAGCCA 7440 

ATGGGTTGAA ATAACTCTTG TTCAAAGTGG TTTTACGGTT TCAAGAACTC CTCTCAGTTC 7500 

TGAGGACACG GTAATGATTG ATGCGACGGA AGTAAAAATC AATCGCCCTA AAAAAACAAT 7560 

TAGCGAATGA TTCTGGTAAA AAGAAATTTC ACGCTATGAA GGCTCAAGCG ATTGTCACAA 7620 

GTCAAGGGAG AATTGTTTCT TTGGATATCG CTGTGAACTA TAGTCATGAT ATGAAGTTGT 7680

TCAAAATGAG TCGTAGAAAT ATCGAACAAG CTGGTAAAAT CTTGGCTGAC AGTGGTTATC 7740 

AAGGGCTCAT GAAGATATAT CCTCAAGCAC AAACTCCACG TAAATCCAGC AAACTCAAGC 7800 

CGCTAACAGC TGAAGATAAA GCCTATAACC ATGCGCTATC TAAGGAAAGA AGCAAGGTTG 7860 

AGAACATCTT TGCCAAAGTA AAAACGTTTA AAATATTTTC AACAACCTAT CGAAATCATC 7920 

GTAAACGCTT CGGATTACGA ATGAATTTGA GTGCTGGTAT TATCAATCAT GAACTAGGAT 7980 

TCTAGTTTTG CAGGAAGTCT ATTGAGGTAT TGAGCTAGTT TATGAAAAAA TTGGGTGAAA 8040 

AGTCGAGTGT TTTAGAAACC CACAGTGTAG TATTCTAGTT TCAATCCACT ATATTTTGCT 8100 

ACTCCCCGTA AAGTTTCTAT TTTCCCTGAT TTCTGATATA ATAGAAATAT TGACTTCAAG 8160 

AGTAAGGAAG AGAAGATGAA CGCATTATTA AATGGAATGA ATGACCGTCA GGCTGAGGCG 8220 

GTGCAAACGA CAGAAGGTCC CTTGCTAATC ATGGCAGGGG CTGGTTCTGG AAAGACTCGT 8280 

GTTTTGACCC ACCGTATCGC TTATTTGATT GATGAAAAGC TGGTCAATCC TTGGAATATC 8340 

TTGGCCATTA CCTTTACCAA CAAGGCTGCG CGTGAGATGA AAGAGCGTGC TTATAGCCTC 8400 

AATCCAGCGA CTCAGGACTG TCTGATTGCG ACCTTCCACT CCATGTGTGT GCGTATTTTG 8460 

CGTCGCGATG CGGACCATAT TGGCTACAAT CGTAATTTTA CAATTGTGGA TCCTGGTGAA 8520 

CAGCGAACGC TCATGAAACG TATTCTCAAA CAGTTGAACT TGGACCCTAA AAAATGGAAT 8580 

GAACGAACTA TTTTGGGGAC CATTTCCAAT GCTAAGAATG ATTTGATTGA TGATGTTGCT 8640 

TATGCTGCCC AAGCTGGCGA TATGTATACG CAAATTGTGG CCCAGTGTTA TACAGCCTAT 8700 
CAAAAAGAAC TTCGTCAGTC TGAATCCGTT GACTTTGATG ATTTGATTAT GCTGACCTTG 8760

CGTCTCTTTG ATCAAAATCC TGATGTTTTG ACCTACTACC AGCAAAAATT CCAATACATC 8820 

CACGTTGATG AGTACCAAGA TACCAACCAC GCTCAGTACC AATTGGTCAA ACTCTTGGCT 8880 

TCCCGTTTTA AAAATATCTG TGTGGTTGGG GATGCGGACC AGTCTATCTA CGGTTGGCGT 8940 

GGTGCTGATA TGCAGAATAT CTTGGACTTT GAAAAGGATT ACCCCAAAGC CAAGGTTGTT 9000 

TTGTTGGAGG AAAATTACCG CTCAACCAAA ACCATTCTCC AAGCGGCCAA CGAGGTTATT 9060 

AAAAATAATA AAAATCGCCG TCCTAAAAAT CTCTGGACTC AAAACGCTGA TGGGGAGCAA 9120 

ATCGTTTACT ATCGTGCCGA TGATGAGCTG GATGAGGCTG TATTTGTAGC CAGAACCATC 9180 

GATGAACTTA GTCGCAGTCA AAACTTCCTT CATAAGGATT TTGCAGTTCT CTATCGGACT 9240 

AATGCCCAGT CCCGTACAAT TGAGGAAGCC CTGCTCAAGT CTAACATTCC TTATACCATG 9300 

GTTGGCGGAA CCAAATTCTA CAGCCGTAAG GAAATTCGCG ATATTATTGC TTATCTCAAC 9360 

CTTATTGCTA ATTTGAGTGA CAATATTAGT TTTGAGCGTA TTATCAACGA GCCTAAACGT 9420 

GGAATTGGTC TAGGTACAGT TGAGAAAATC CGTGATTTTG CAAATTTGCA AAATATGTCT 9480 

ATGCTGGATG CTTCTGCTAA TATTATGTTG TCTGGTATCA AGGGTAAGGC AGCCCAATCT 9540 

ATCTGGGATT TTGCCAATAT GATGCTTGAT TTGCGGGAGC AGCTAGACCA CTTAAGCATT 9600 

ACAGAGTTGG TTGAGTCCGT CCTAGAAAAA ACAGGTTATG TCGATATTCT TAACTCCCAA 9660

GCGACTCTAG AAAGCAAGGC ACGGGTTGAA AATATCGAAG AGTTTCTTTC TGTTACGAAG 9720 

AACTTTGATG ACACCACGGA TGTGACAGAA GAGGAAACTG GTCTGGACAA ACTGAGTCGT 9780 

TTCTTAAATG ACTTGGCTTT GATTGCCGAC ACAGATTCAG GTAGTCAGGA GACATCAGAA 9840 

GTGACCTTGA TGACCCTGCA TGCTGCCAAA GGTCTCGAAT TTCCAGTTGT CTTTTTGATT 9900 

GGGATGGAAG AAAATGTCTT TCCACTTAGT CGTGCGACTG AAGATTCAGA TGAATTAGAA 9960 

GAAGAGCGCC GTCTAGCCTA TGTAGGTATC ACGCGTGCAG AGAAAATTCT CTATCTGACC 10020

AATGCCAACT CACGCTTGCT TTTTGGTCGT ACCAATTATA ACCGTCCGAC TCGTTTTATT 10080 

AACGAAATCA GTTCAGACTT GCTTGAGTAT CAAGGTCTGG CTCGTCCTGC AAATACAAGC 10140 

TTTAAGGCAT CATATAGCAG TGGTAGTATT TCCTTTGGTC AAGGTATGAG TTTGGCTCAG 10200 

GCTCTTCAAG ACCGTAAACG CGGTGCTGCC CCAAAATCAA TCCAGTCAAG CGGTCTTCCA 10260 

TTTGGTCAAT TTACAGCTGG CGCAAAACCA GCATCTAGCG AGGCAAATTG GTCCATTGGT 10320 

GATATTGCTC TCCACAAGAA ATGGGGAGAG GGAACCGTTC TGGAAGTTTC AGGTAGCGGT 10380 

GCTAGGCAGG AATTGAAAAT CAATTTCCCA GAAGTAGGTT TGAAAAAACT TTTAGCCAGT 10440 

GTGGCTCCAA TTGAGAAAAA AATCTAATTT TCCATCCTTC TCACGAATAA TAAAGTGAGG 10500 
AGGATTTTTA TGTACAGTAT TTCATTCCAA GAAGATTCAC TATTACCAAG AGAAAGGCTG 10560

GCCAAGGAAG GAGTTGAAGC GCTTAGTAAC CAAGAGTTGC TAGCTATTTT ACTCAGGACA 10620

GGAACACGTC AAGCTAGCGT TTTTGAAATT GCCCAAAAAG TCTTGAACAA TCTTTCAAGC 10680

CTAACGGATT TGAAAAAAAT GACCCTGCAG GAATTGCAGA GTTTGTCTGG TATTGGGCGT 10740

GTTAAGGCCA TAGAATTACA AGCTATGATT GAACTGGGGC ATCGTATTCA CAAACACGAG 10800

ACTCTTGAAA TGGAAAGTAT TCTCAGCAGT CAAAAGTTGG CCAAGAAGAT GCAGCAGGAA 10860

TTAGGGGATA AAAAACAAGA GCACCTGGTG GCACTCTATC TCAATACTCA AAATCAAATC 10920

ATCCATCAGC AGACCATTTT TATCGGGTCT GTAACTCGTA GTATCGCTGA ACCGCGAGAG 10980

ATTCTTCACT ATGCAATCAA G<JA 1 A TGGCG ACTTCTCTTA TCTTGGTCCA CAATCATCCT 11040

TCAGGAGCGG TAGCGCCTAG LLAAAAiuAl GATCATGTCA CTAAACTTGT TAAAGAAGCC 11100

TGCGAATTGA TGGGGATTGT [illegible] [illegible] TCTCTCATTC TAATTACTTT 11160

AGTTATCGTG AAAAGACAGA [illegible] [illegible] LGACATAGTC AAAGAGTTTT 11220

TTATCTTTGG GACGATTTTC AAAAAGAAGT TCTGGATGCC ATTGGACACC GAGAAAGGCG 11280

ACATCATCCG TACTCATGAC AGCCTCAATG ATACCATCTT TAGGATCATG AGCCACAACT 11340

TTTAAATTTG GTGCTAAGTC CTTGATGCTC TGGTGGTGGA AGGAGTTGAT ATGAGAGATT 11400

TCTCCATAGA TTTCTTGGAG AACGGTATCT GGTTCTGTTA CCAAGCGTTG AGTTGTGTAC 11460

TCAACAGAAG AATCCTGCCA ATGGTCTTCG ATATCTTGGT ACAAAGTTCC ACCCATGGCA 11520

ACGTTAAAGA GTTGGGTACC ACGGCAGACA GAGAAAATGG GCTTTTTCTG TTTAATAGCT 11580

TCCTTGATGA GGGCCAGTTC GAAGATATCT CTTTGAAGGT GATAGTCATC ACTATCAATG 11640

GTTTTGGGTT CGCCATAAAA TTTTGGATCG ACATTTTGCC CACCTGTCAA GATGAGCTTG 11700

TCAATCAAAC TGATATAGTG GCAGGCCATT TCTTGATCAC CAATCGGTAG GATGATGGGA 11760

ATCCCTCCAG CATCTTTAAC GCCTTCAACA AAGCCTTTTG CTGCGTAGCT CATCATGATG 11820

TCATCATCTG GATGAGTTTT TTCGTTTCCT GTAATCCCAA TAACTGGTTT TTTCATAAAA 11880

TGATTTTCGC TTTCTAATCC TCTTTTCGCA TGAAGTAGAG GAGGGTTTGG AGTTCACTTG 11940

TCAAATCGAC ATACTGAACG ACCACGTCTT TTGGTAAATG CAGATGGACT GGTGAAAAAC 12000

TGAGAATTCC TTTCACACCA GCATCAACCA AGAGATTAGC AACCTCTTGT GACTTGACGC 12060

TGGGAACAGT TAGGATAGCA GTCTTCACAT CAGCATCCTT GATTTTATCC TTGATCTGAG 12120

AAATCCCGTA AATGGGAATC CCGTCAGGAG TTTGGGTACC GACTTCAGGA TGGTCGTCTA 12180

GGTCAAAGGC CATGATAATC TTCATCTTGT TACGTTCGTG GAAGCGGTAG TGGAGAAGGG 12240



WO 98/18931 



PCT/US97/19588 



CATGGCCCAT ATTTCCAATA CCAACCAGCA TGACATTGGT AATAGAGTTG TCATTGAGCA 12300

AATCGGCAAA AAATGTCATT AGTTTTTTGA CATCATAGCC AAAACCACGA CGACCAAGTT 12360

CACCAAAATA GGAAAAATCA CGACGTACGG TCGCTGAATC AATACCGATA GCCTCTGCAA 12420

TTTGCTTAGA GTTGGCACGT TCAATCTTTT CTGCATGAAA TCTCTTAAAA ATTCGATAGT 12480

AGAGAGAGAG TCTTTTTGCT GTAGCTTTTG GAATAGCAAA CTCTTTATCT TTCACAAAAT 12540

CACAACCTTT CTATTCTTCT ATTTTATAGA AACATTGTGA AAAAATCAAC AAAAATAAGA 12600

AAAAACTAAG AAAAATCTTA GTTTTGATGT AAAAAATCTG CATGAGATAG AAAACGGTAG 12660

AGGTCTCCGA CCAGCCCCTG ATAAACTTTT TTGCCCCTAA AAGTCAGAGA AGTCACATAA 12720

AGTGTATCTG GTAAGGTTAC ACATCCTGAC AAAGTCAACA TGAGAGCCTC ATGATCCTCA 12780

TACTTGAGAG TACGCTCTAC ATGATAGCAG TCCTTATAGG TCAGTTCAAA CATTTTGGCT 12840

CTATCTTTCC GATTTTGTAA AGACACCACG TTCTACCAAG CTATCCATGA GGAAGTAGAA 12900

TTTTTCCTGA TGAATATGGT GGTCTTCTGA TTTGAAAATA TCAACTAGAC GAAGGCCAAA 12960

CTTGTCAGTG ATATTGATTT TAGCCCCTGT AAGTTCCTTG TTAATGATGA TTTTGAGTTG 13020

GAAGCCTTCA CCGCTGTTTG GCACTTTTTC CAAAAGGCGA GTCAGTTCAT AGTTACCAAC 13080

CTTAGTTTCA AAAAAGGTGT TATCTTTGAG GGTGAATTTT TTAACAGAAG GGCTAAGAGT 13140

GTAATCGTAA CGACAATTTT TTAACTGAAT GATTTTTTCA AATGCCATAT GGCTAACCTC 13200

CGATAATTTC TTTTAAGGTT TTTGCGAGGG TTTGTAGGTC TTCAACGGTA TTTTGTGGCG 13260

ACAAACTGAT GCGAAGGGAT TCCTTCAAGC GTTCTGAATT TGCGCCATAC ATGGCTTCAA 13320

GAACATGGCT GGATTGGACA ACGCCTGCAG TACAGGCTGA GCCAGTAGAG ATTGAAATTC 13380

CAGCTAAATC TAGCCGAAGG AGTAAGAGGT CATTTTTCTG ACCAGGAAAT CCAATATTGA 13440

GAACATAAGG GAGATGATGT TTTCCTCTAT TCAGGTAATA CTGAATGCCC TCCAGCTCTG 13500

CCAGAAAGGC AGTTTCTAGA TTTTGTACAT GTTGAAAATG TTCTTCTTGT TTTTCTAGGT 13560

CTTCTTTTAG GGCTGCAACC ATGCCTACAA TGGCAGGCAG ATTTTCAGTT CCTGCACGTT 13620

TTTTCTGTTC CTGGTCTCCG CCATGTAGAT AGGAATCAAA GTCCATGCTA GATGCGTAGA 13680

GAAAACCGAT TCCCTTAGGA CCATGGAATT TGTGGGCAGA AGCAGTGAGA AAATCAATGC 13740

CCAATTCTTC TGAATGAATT GGGATTTTAC CAATAGCCTG AACTGCATCA ACATGATAGG 13800

CAGCAGGGTG TTGCTTGAGT ATTTGGCCAA TTTCAGCGAT GGGCAGTAGG TTTCCTGTCT 13860

CATTATTGAC AAACATGGTA GAAACCAAAA TCGTATCGTC ACGTAAAGCC TTTTGAATTT 13920

GCTGGGCTGT GATTTCTTGA TTTTCTGGCT GGATAATGGT TGCTTCAAAC CCAAAGTGTT 13980

GAACCAAGTA ATCAATTGTT TCAAGGACAG CATGGTGCTC GATGGCAGTT GTGATGATAT 14040
GTTTTCCTTG TTCTTGGTGA CGAAGACAGT AGCCAATGAT GGTAGTATTA TTGCCTTCAG 14100

TCCCACCAGA AGTGAAAAAG ATATGTTGAG GTTTTGTCCT TAGTAACTGG GCTAGTTCCT 14160

GACGGGCTTC TCGCAAGAGT TTGCCAGCTT GACGACCATG ACCATGAATA CTAGAAGGAT 14220

TTCCGTGGGT TTCTTGCATA ACCTTGGTCA TAGCTGAAAT AGCAACTGCT GACATAGGAG 14280

TCGTTGCAGC ATTGTCCAAA TAAATCAAAG AATCACCTTA TTTCTTTTTA TTGTAGGCAA 14340

AGAGTGGGCT GACTGGTTTT CTTTCGTGAA TACGGACGAT AGCATCACCA ATTAACTCAC 14400

TAGCAGTGAT GTAGCATACA TTTTTAGGAG TTTTTTCTTT TGTTGCTACT GAATCAGTCA 14460

CAAGAATTTC TTTAATATTA GTATTGTCAA GAAGCTCAGC AGCTCCCTCG ACGAAGAGAC 14520

CGTGGCTAGA AACAGCATAA ATTTCTGTAG CTCCTTCACG TTCAACGATT TTAGAAGCTT 14580

CAGAGAAGGT ACGTCCTGTA TTTAAAATAT CATCAATCAA GATAGCTTTC TTACCTTCAA 14640

CATCACCAAT AATATAACCT TCGTTACGAG TTGCATCGTC TTGAGGGTAG TCGATAATGG 14700

CGATAGGAGC ATCAAGATAT TCAGCCAGGC TACGCGCACG TTTGACACCT GAATTTTTAG 14760

GGCTAACGAC AACAACATCT GAACCAAGCA ATCCTTTATC GCAGTAATGT TTTGCGAATA 14820

GGGGAACAGT GAAAAGATTA TCCACTGGAA TATCAAAGAA ACCTTGAACC TGAACGGCAT 14880

GCAAATCAAG AGTCAGGATA CGATCAACTC CAGCCTTAAC CAGCATATTG GCAACTAGTT 14940

TTGCTGTAAG TGGCTCACGA GGACAAGCAA TGCGGTCTTG ACGTGCATAG CCAAAATATG 15000

GAAGGACAAC GTTGATACTG TGGGCACTTG CACGCACACA AGCATCGACC ATGATTAACA 15060

ATTCCATTAG GTGGTTGTTG ACAGGGAAAC TTGTTGATTG GATGATGTAA ACATCATAAC 15120

CACGGACACT TTCTTCGATA TTTACTTGGA TTTCTCCGTC TGAAAATTGA CGTGATGATA 15180

GTTTTCCAAG TGGGACACCA ACAGCTTGGG CAATTTTTTG TGCAATCTCT TGGTTAGAGT 15240

TGAGTGCGAA AAGTTTCATG TTTTTTCTAT CTGACATTAT AGACCGTCCT CTGTAAACTT 15300

TATAAATCCT AGTTATATTT ACCTTACATA TATGAACTGG GATTTGTGTA TTTTTATCTT 15360

TTCTATTTTA CCAAAAAATG GAGATTATTT CAGCTATTTT TCATACTTTT GACAAATCGA 15420

ACCAATTTTG AAGGAGCTTT TTGATAGGAA ATCTGATTTT TCTCTAAAAA TTGTCGAAAA 15480

TCCTGTTTGC CTTGCTCATG ATTTTCCACT TCAAGCTCCA ATTCGTAATC TGTTATATCA 15540

AAGTATCGGC TCTGATCCAG TGCCATGAGA CCAATAGCTG TTTTCATTTC ATAGCGAAGC 15600

GTTGTTAGAC AACCAAGAAC CTGCCAGTTC TTACTTTGGA TACCATGTTT CGCCAATTCA 15660

TCCAGTACTA GCCCTTGAGG AAGTTCTTCC TTACTCAGAT AGTTCTCAGC ATCTTTTAGT 15720

TGCAATTTTT GGTTGTATTC CATGTTTCCA ACACTCTGCG GGACTTTGAG TGTCAACTCA 15780
GCCCAGTCTT CAAAGGTTCG AATGCGCATA GCGACTTTCT TTTCTCGCAG TTCAAAATCA 15840

GGCGTGTCGA TGTAGTAATT TGTTTGAAGA ACAGGAGTGA CACCTGTGAA CTGGTCTTTT 15900

AGACGATTGT ATTCATCTTT TTTCAATAGT GTTTTCAATT CAATTTCTAA ATGTTTCATT 15960

TTTCTTACCT TTTTTTATCG TTGAAAGCGG ATTTATGGTA TAATAAGCAT TGTATTTATT 16020

GTATATGAAT CTGGAGAAAA AATCAAAGAT ATTTTTGACG GATAATATGA GAACAAGGGA 16080

GAATATATGA CCTTAGAATG GGAAGAATTT CTAGATCCTT ACATTCAACC TGTTGGTGAG 16140

TTAAAGATTA AACTTCGTGG TATTCGTAAG CAATATCGTA AGCAAAATAA GCATTCTCCA 16200

ATTGAGTTTG TGACCGGTCG AGTCAAGCCA ATTGAGAGCA TCAAAGAAAA AATGGCTCGT 16260

CGTGGCATTA CTTATGCGAC CTTGGAACAC GATTTGCAGG ATATTGCTGG CTTACGTGTG 16320

ATGGTTCAGT TTGTAGATGA CGTCAAGGAA GTAGTGGATA TTTTGCACAA GCGTCAGGAT 16380

ATGCGAATCA TACAGGAGCG AGATTACATT ACTCATAGAA AAGCATCAGG CTATCGTTCC 16440

TATCATGTGG TAGTAGAATA TACGGTTGAT ACCATCAATG GAGCTAAGAC TATTTTGGCA 16500

GAAATTCAAA TTCGTACTTT GGCCATGAAT TTCTGGGCAA CGATAGAACA TTCTCTCAAC 16560

TACAAGTACC AAGGGGATTT CCCAGATGAG ATTAAGAAGC GACTGGAAAT TACAGCTAGA 16620

ATCGCCCATC AGTTGGATGA AGAAATGGGT GAAATTCGTG ATGATATCCA AGAAGCCCAG 16680

GCACTTTTTG ATCCTTTGAG TAGAAAATTA AATGACGGTG TAGGAAACAG TGACGATACA 16740

GATGAAGAAT ACAGGTAAAC GAATTGATCT GATAGCCAAT AGAAAACCGC AGAGTCAAAG 16800

GGTTTTGTAT GAATTGCGAG ATCGTTTGAA GAGAAATCAG TTTATACTCA ATGATACCAA 16860

TCCGGATATT GTCATTTCCA TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA 16920

CGAAAATCAG CTTGACAAGG TCCGCTTTAT CGGTCTTCAT ACTGGACATT TGGGCTTCTA 16980

TACAGATTAT CGTGATTTTG AGTTGGACAA GCTAGTGACT AATTTGCAGC TAGATACTGG 17040

GGCAAGGGTT TCTTACCCTG TTCTGAATGT GAAGGTCTTT CTTGAAAATG GTGAAGTTAA 17100

GATTTTCAGA GCACTCAACG AAGCCAGCAT CCGCAGGTCT GATCGAACCA TGGTGGCAGA 17160

TATTGTAATA AATGGTGTTC CCTTTGAACG TTTTCGTGGA GACGGGCTAA CAGTTTCGAC 17220

ACCGACTGGT AGTACTGCCT ATAACAAGTC TCTTGGCGGT GCTGTTTTAC ACCCTACCAT 17280

TGAAGCTTTG CAATTAACGG AAATTGCCAG CCTTAATAAT CGTGTCTATC GAACACTGGG 17340

CTCTTCCATT ATTGTGCCTA AGAAGGATAA GATTGAACTT ATTCCAACAA GAAACGATTA 17400

TCATACTATT TCGGTTGACA ATAGCGTTTA TTCTTTCCGT AATATTGAGC GTATTGAGTA 17460

TCAAATCGAC CATCATAAGA TTCACTTTGT CGCGACTCCT AGCCATACCA GTTTCTGGAA 17520

CCGTGTTAAG GACGCCTTTA TCGGCGAGGT GGATGAATGA GGTTTGAATT TATCGCAGAT 17580
GAACATGTCA AGGTTAAGAC CTTCTTAAAA AAGCACGAGG TTTCTAAGGG ATTGCTGGCC 17640

AAGATTAAGT TTCGAGGTGG AGCTATTCTG GTCAATAATC AACCGCAAAA TGCAACGTAT 17700 

CTATTGGACG TTGGAGACTA CGTTACCATT GACATTCCCG CTGAGAAAGG CTTTGAAACC 17760

TTGGAGGCTA TTGAGCTTCC ATTAGATATT CTCTATGAGG ATGACCACTT TCTAGTCTTG 17820 

AATAAACCCT ATGGAGTGGC TTCTATTCCT AGTGTCAATC ACTCTAATAC CATTGCCAAT 17880 

TTTATCAAGG GTTACTATGT CAAGCAAAAT TATGAAAATC AGCAGGTTCA CATTGTTACC 17940 

AGACTAGATA GGGATACTTC TGGCTTGATG CTCTTTGCCA AGCACGGTTA TGCCCATGCA 18000 

CGATTAGACA AGCAGTTGCA GAAGAAATCT ATCGAGAAAC GCTACTTTGC TTTGGTTAAG 18060 

GGAGATGGAC ATTTGGAGCC AGAAGGGGAA ATTATTGCTC CGATTGCGCG TGATGAAGAT 18120 

TCCATTATTA CCAGACGAGT GGCTAAAGGC GGAAAGTATG CCCATACTTC ATACAAGATT 18180 

GTAGCTTCTT ATGGAAATAT TCACTTGGTC TATATTCACC TGCACACTGG TCGAACCCAT 18240 

CAAATCCGAG TCCATTTTTC TCATATCGGT TTTCCTTTGC TGGGAGATGA TTTGTATGGT 18300 

GGTAGTCTGG AAGATGGTAT TCAACGTCAG GCTCTGCATT GCCATTACCT ATCCTTTTAT 18360 

CATCCATTTT TAGAGCAAGA CTTGCAGTTA GAAAGTCCCT TGCCGGATGA TTTTAGTAAC 18420 

CTTATTACCC AGTTATCAAC TAATACTCTA TAAAAACTGT CTCAGAGTAT AATTATTATC 18480

TTAAAGGAGA AAACTCATGG AAGTTTTTGA AAGTCTCAAA GCCAACCTTG TTGGTAAAAA 18540 

TGCTCGTATC GTTCTCCCTG AAGGGGAAGA GCCTCGTATT CTTCAAGCAA CAAAACGCTT 18600 

AGTAAAAGAA ACAGAAGTGA TTCCTGTTTT GCTTGGAAAT CCTGAAAAAA TTAAAATTTA 18660 

TCTTGAAATT GAAGGAATCA TGGATGGTTA TGAGGTCATC GACCCTCAAC ATTATCCTCA 18720 

ATTTGAAGAA ATGGTTTCTG CCTTGGTGGA GCGTCGCAAG GGCAAAATGA CTGAAGAAGA 18780 

TGTACGCAAG GTTTTGGTTG AAGATGTCAA CTACTTTGGT GTGATGTTGG TTTACTTGGG 18840 

CTTGGTTGAT GGAATGGTGT CAGGAGCGAT TCACTCAACA GCTTCAACAG TTCGCCCAGC 18900 

TCTACAAATC ATCAAAACTC GTCCAAATGT AACTCGTACT TCAGGAGCCT TCCTCATGGT 18960 

TCGTGGTACG GAACGTTACC TATTTGGAGA CTGTGCCATT AACATCAATC CAGATGCAGA 19020 

AGCCTTGGCT GAAATTGCCA TCAACTCAGC AATCACAGCT AAGATGTTTG GCATCGAACC 19080 

TAAAATTGCC ATGTTGAGCT ATTCTACTAA AGGTTCAGGG TTTGGTGAAA GCGTTGATAA 19140 

GGTCGTTGAA GCAACTAAAA TTGCTCACGA CTTGCGTCCT GACCTTGAAA TCGATGGTGA 19200 

GTTGCAATTT GATGCAGCCT TTGTTCCTGA AACTGCAGCT CTGAAAGCTC CTGGAAGTAC 19260 

GGTAGCTGGT CAAGCAAATG TCTTCATCTT CCCAGGTATC GAGGCAGGAA ATATTGGTTA 19320 
CAAGATGGCT GAACGCCTGG GTGGCTTTGC GGCTGTAGGA CCTGTTTTGC AAGGTTTAAA 19380

CAAGCCAGTT AATGATCTTT CTCGTGGATG TAATGCAGAT GATGTTTACA AGTTGACCCT 19440

CATCACAGCA GCTCAAGCAG TTCATCAATA GTGAAAACTA TAAAGTGATA TACTATGCTA 19500

TACTGTAGTT ATGAAACTAT GTACGAAAAG CACTGCCATT AATTCCTGAG AACTAAATTA 19560

CTGATTGGTG TCAAAAAGGA AAACTTCCAA GCGATGATAT CCTGTCTATA CACGACCTAT 19620

AGAAATCTGT AATATACATA TCCGTAAAAC GATAAATTCC CTTTTTGATT TTAAATGAGT 19680

ATGAAAAGAG AATTTTTTGG CTCTTTGTCA ACTGTAGTGG GTTGAAGAAA AGCTAAGCTC 19740

GAGAAAGGAC AAATTTCATC CTTTCTTTTT TGATATTCAG AGCGATAAAA ATCCGTTTTT 19800

TGAAGTTTTC AAAGTTCCGA AAACCAAAGG CATTGCGCTT GATAAGTTTG ATGAGATTAT 19860

TGGTCGCTTC CAGTTTGGCG TTAGAATAGT GTAGTTGAAG GGCGTTGATA ATCTTTTCTT 19920

TATCTTTGAG GAAGGTTTTA AAGACAGTCT GAAAAATAGG ATGAACCTGC TTAAGATTGT 19980

CCTCAATAAG TCCGAAAAAT TTCTCTGGTT CCTTATTCTG GAAGTGAAAA AGCAAGAGTT 20040

GATAGAGCTG ATAGTGGTGT TTCAAGTCTT CCGAATAGCT CAAAAGCTTG TTTAAAATCT 20100

CTTTATTGGT TAAGTGCATA CGAAAAATAG GACGATAAAA TCGCTTATCA CTCAGTTTAC 20160

GGCTATCCTG TTGAATGAGT TTCCAGTAGC GCTTGATAG 20199

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19702 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ACCCGATGTA TCAGCGGATA TTTACTCTAT TTTTCAAACG ATGTTATACC CACAATAAAA 60

GAAAAAAGAC CCTAAGGTCT CCTTTGCTTT TATTATTAAA CGCGTTCAAC TTTACCTGAT 120

TTCAAAGCAC GAGCTGAAGC CCAAACTTTT TTAGGTTTAC CATCGATAAG AACAGTAACT 180

TTTTGAAGGT TTGGTTTTAC GGCACGTTTT GTTTGGTTCA TCGCGTGTGA ACGGTTGTTT 240

CCTGATACAG TCTTACGACC TGTAAAGTAA CATACTTTAG CCATTGTGTT TTCCTCCTAT 300

TAGATCTAAT ATAGCGGATG TGCTAGCACC ACATACCGTA CTATGTTATC ACATTTTCTT 360

GTTTTTTGCA AGGGAATTGG AAGATTTTTT ATTTGTGTCT TAAATCAGGT CTTGCGTGAC 420

ATTTcTGCTC TCCACATGCC ATCGTTGATT AACAGAACAC CAGAATTAAA ATTATGTGTA 480

TAAAAATCAT CTCTAACTGC AGCTAAGGGT ATAGCCGTCA AGTCCAAATC CCACAGCTCA 540
TCTATCGATT TTCTTACAAC AATATCTGAA TCCAAATACA GTACACGAGA CTCGCTTACA 600

TACTTTGGAA TAAAATACCT AAAAAAGCCG CATATGAAAG TCCCTCAAAG GGGAGACGAT 660 

AACCTTTCAG AATATTACTG TCAATCTAAA CATTCACAAT CTCACTATTC AAAGTCTCTA 720 

GTCTTTTTTC CATCAATTGG AACCATTCTC GCGGAAGGTC ATCATTAAAA ACATAAAACT 780

TAAGATTATA ATGATGAACA CAAAGAGATT TTATTGTTGT TTCAACTTTA TCCATATAAG 840 

CATTATCTGC ACCTAAGACA ATCGCTTTTT TCTCTTCTTT CACTTTTTAT CTCATTTCTT 900 

TTTATTCCCA TCATATTATT CCCATCATAT GTTTCCCATC ATATGTTTCT ACGTAACCAT 960 

TATTTTCGCC TATTCGTTCG TAAAACCATA CCAGTGGAGA TTTTAGATGA AGTCCCATTA 1020 

CGGTTTACAA TTTTTACATT ACGACACGGA GTTTTACAAA TCGATTTCAT TTGCCAAACG 1080 

TAGTTAGTGA GGCAGTTAGC TAGTTCGCCA AATAGCGACT AGCGTCCAAC AATTTGGAAC 1140 

TTTAGTTCCA ATTGTTGGTA CTGAGTCACA TCTTCTCCTC TAACTCTACG TCTGGATACT 1200 

TGTCCGCAAA CCAGCGGAGG GCAAAGTCAT TTTCAAAGAG AAAGACTGGT TGGTCAAAAC 1260 

GGTCTTTGGC TAAGATATTG CGACTTGACG ACATCCGTTC ATCCAAGTCC TCAGGCTTGA 1320 

TCCAACGAAC GGTCTTTTTA CCCATTGGGT TCATAACTAC TTCCGCATTG TACTCGCCTT 1380 

CCATGCGGTG TTTAAAGACT TCAAACTGGA GTTGACCTAC AGCGCCTAGC ATGTACTCAC 1440 

CTGTTTGGTA ATTCTTATAA AGCTGAACGG CTCCTTCTTG CACCAATTGC TCAATCCCCT 1500 

TGTGGAAGGA TTTTTGCTTC ATAACATTCT TAGCAGAAAC TTTCATGAAA ATCTCAGGTG 1560 

TAAAGGTTGG CAGGGGTTCA AATTCAAACT TGTTTTTTCC AACCGTCAAG GTATCCCCAA 1620 

CCTGATAAGT ACCGGTATCG TAAACCCCGA TAATATCACC TGCCACGGCA TTGGTCACAT 1680 

TCTCACGACT CTCCGCCATA AACTGGGTAA CATTAGATAG TTTAGCCCCC TTACCAGTAC 1740 

GAGGGAGATT GACACTCATG CCGCGCTCAA ATTCGCCAGA TACGATACGG ACAAAGGCAA 1800 

TACGGTCACG GTGACGAGGG TCCATGTTGG CTTGGATTTT AAAGACAAAG CCTGAGAAAT 1860 

CCTTGTCATA AGGATCCACA ATTTCACCGT CTGTTTTCTT GTGACCATGT GGTTCTGGAG 1920 

CAAACTTGAG GAAGGTTTCA AGGAAGGTCT GCACACCAAA GTTTGTCAGG GCTGAACCGA 1980 

AAAAGACAGG CGTCAATTCT CCAGCCAGAA TAGCTTCCTC TGAAAACTCA TTCCCGGCTT 2040 

CATTTAAAAG CTCAATGTCA TCCTTGACTT GCTCGTAGAA AGGATTGCTA CCAAAGAGTT 2100 

TGTCCCCGTC TTCTAGACTG GCAAAACGCT CATCCCCTTT GTAAAGCTCT AAACGTTGGT 2160 

TATAGAGGTC ATACAAGCCC TCAAAGGCTT TCCCCATCCC GATAGGCCAG TTCATAGGGT 2220 

AGCTAGCAAT GCCCAAGATT TCTTCCAATT CTTGCAAGAG ATCCAAAGGC TCACGACCGT 2280 
CACGGTCCAG CTTGTTCATA AAGGTAAAGA CTGGAATGCC ACGATGTTTC ACAACCTCAA 2340

ACAATTTCTT GGTTTGAGCC TCGATCCCCT TGGCAGAGTC CACGACCATG ACCGCAGCAT 2400 

CCACCGCCAT CAAGGTACGA TAGGTATCTT CTGAGAAGTC CTCGTGCCCT GGCGTGTCTA 2460 

AGATATTCAC GCGCTTGCCG TCGTAGTCAA ATTGCATAAC AGATGAAGTA ACAGAAATCC 2520 

CACGTTGCTT CTCGATATCC ATCCAGTCAG ATTTAGCAAA AGTCCCTGTT TTCTTCCCTT 2580 

TTACCGTACC AGCCTCACGA ATCTCACCCC CAAAGTAGAG TAACTGCTCA GTGATGGTTG 2640

TTTTCCCCGC GTCCGGGTGG GAGATAATGG CAAAGGTACG ACGTTTCTTA ATTTCTTCTT 2700 

GAATATTCAT AAGTTCTCTT TCTTTGATTC TCTATTTTTC TTGTTTCAAT AGCTGAGAAT 2760 

GATTTTTACA TTGGATTTTA CCATTCCTTT CAACACTCCA TTATATCGGA TTTTAGCATT 2820 

TTTTTCAATT TCTATTTCTT TTCACTTCCC CCTCCCTTAT TTATAGGAAA ATATGGTAAA 2880 

ATAGAACAGA CTAAAAATCA TCATTTCACG AAAGGATGCA AGATGAAAAT TACGCAAGAA 2940 

GAGGTAACAC ACGTTGCCAA TCTTTCAAAA TTAAGATTCT CTGAAGAAGA AACTGCTGCC 3000 

TTTGCGACCA CCTTGTCTAA GATTGTTGAC ATGGTTGAAT TGCTGGGCGA AGTTGACACA 3060 

ACTGGTGTCG CACCTACTAC GACTATGGCT GACCGCAAGA CTGTACTCCG CCCTGATGTG 3120 

GCCGAAGAAG GAATAGACCG TGATCGCTTG TTTAAAAACG TACCTGAAAA AGACAACTAC 3180 

TATATCAAGG TGCCAGCTAT CCTAGACAAT GGAGGAGATG CCTAATGACT TTTAACAATA 3240 

AAACTATTGA AGAGTTGCAC AATCTCCTTG TCTCTAAGGA AATTTCTGCA ACAGAATTGA 3300 

CCCAAGCAAC ACTTGAAAAT ATCAAGTCTC GTGAGGAAGC CCTCAATTCA TTTGTCACCA 3360 

TCGCTGAGGA GCAAGCTCTT GTTCAAGCTA AAGCCATTGA TGAAGCTGGA ATTGATGCTG 3420 

ACAATGTCCT TTCAGGAATT CCACTTGCTG TTAAGGATAA CATCTCTACA GACGGTATTC 3480 

TCACAACTGC TGCCTCAAAA ATGCTCTACA ACTATGAGCC AATCTTTGAT GCGACAGCTG 3540 

TTGCCAATGC AAAAACCAAG GGCATGATTG TCGTTGGAAA GACCAACATG GACGAATTTG 3600 

CTATGGGTGG TTCAGGTGAA ACTTCACACT ACGGAGCAAC TAAAAACGCT TGGAACCACA 3660 

GCAAGGTTCC TGGTGGGTCA TCAAGTGGTT CTGCCGCAGC TGTAGCCTCA GGACAAGTTC 3720 

GCTTGTCACT TGGTTCTGAT ACTGGTGGTT CCATCCGCCA ACCTGCTGCC TTCAACGGAA 3780 

TCGTTGGTCT CAAACCAACC TACGGAACAG TTTCACGTTT CGGTCTCATT GCCTTTGGTA 3840 

GCTCATTAGA CCAGATTGGA CCTTTTGCTC CTACTGTTAA GGAAAATGCC CTCTTGCTCA 3900 

ACGCTATTGC CAGCGAAGAT GCTAAAGACT CTACTTCTGC TCCTGTCCGC ATCGCCGACT 3960 

TTACTTCAAA AATCGGCCAA GACATCAAGG GTATGAAAAT CGCTTTGCCT AAGGAATACC 4020 

TAGGCGAAGG AATTGATCCA GAGGTTAAGG AAACAATCTT AAACGCGGCC AAACACTTTG 4080 
AAAAATTGGG TGCTATCGTC GAAGAAGTCA GCCTTCCTCA CTCTAAATAC GGTGTTGCCG 4140

TTTATTACAT CATCGCTTCA TCAGAAGCTT CATCAAACTT GCAACGCTTC GACGGTATCC 4200 

GTTACGGCTA TCGCGCAGAA GATGCAACCA ACCTTGATGA AATCTATGTA AACAGCCGAA 4260 

GCCAAGGTTT TGGTGAAGAG GTAAAACGTC GTATCATGCT GGGTACTTTC AGTCTTTCAT 4320 

CAGGTTACTA TGATGCCTAC TACAAAAAGG CTGGTCAAGT CCGTACCCTC ATCATTCAAG 4380 

ATTTCGAAAA AGTCTTCGCG GATTACGATT TGATTTTGGG TCCAACTGCT CCAAGTGTTG 4440 

CCTATGACTT GGATTCTCTC AACCATGACC CAGTTGCCAT GTACTTAGCC GACCTATTGA 4500 

CCATACCTGT AAACTTGGCA GGACTGCCTG GAATTTCGAT TCCTGCTGGA TTCTCTCAAG 4560 

GTCTACCTGT CGGACTCCAA TTGATTGGTC CCAAGTACTC TGAGGAAACC ATTTACCAAG 4620 

CTGCTGCTGC TTTTGAAGCA ACAACAGACT ACCACAAACA ACAACCCGTG ATTTTTGGAG 4680 

GTGACAACTA ATGAACTTTG AAACAGTCAT CGGACTTGAA GTCCACGTAG AGCTCAACAC 4740 

CAATTCAAAA ATCTTCTCAC CTACTTCTGC CCACTTTGGA AATGACCAAA ATGCCAACAC 4800

TAACGTGATT GACTGGTCTT TCCCAGGAGT TCTACCAGTT CTCAATAAAG GGGTTGTTGA 4860 

TGCCGGTATC AAGGCTGCTC TTGCCCTCAA CATGGACATC CACAAAAAGA TGCACTTTGA 4920 

CCGCAAGAAC TACTTCTATC CTGATAACCC CAAAGCCTAC CAAATTTCTC AGTTTGATGA 4980 

ACCAATCGGA TATAATGGCT GGATTGAAGT CAAACTAGAA GACGGTACGA CCAAGAAAAT 5040 

CGGTATCGAA CGTGCCCACC TAGAGGAAGA CGCTGGTAAA AACACCCATG GTACAGATGG 5100 

CTACTCTTAT GTTGACCTCA ACCGCCAAGG GGTTCCCTTG ATTGAGATTG TATCTGAGGC 5160 

AGATATGCGT TCTCCTGAAG AAGCCTATGC TTATCTGACA GCCCTCAAGG AAGTTATCCA 5220 

GTACGCTGGC ATTTCTGACG TTAAGATGGA GGAAGGTTCG ATGCGTGTGG ATGCCAACAT 5280 

CTCCCTTCGT CCTTATGGTC AAGAGAAATT CGGTACCAAG ACTGAATTGA AGAACCTCAA 5340 

CTCCTTCTCA AACGTTCGTA AAGGTCTTGA ATACGAAGTC CAACGCCAGG CTGAAATTCT 5400 

TCGCTCAGGT GGTCAAATCC GCCAAGAAAC ACGCCGTTAC GATGAAGCGA ATAAAGCAAC 5460 

CATCCTCATG CGTGTCAAGG AAGGGGCTGC TGACTACCGC TACTTCCCAG AACCAGACCT 5520 

ACCCCTCTTT GAAATTTCTG ACGAGTGGAT TGAGGAAATG CGGACTGAGT TGCCAGAGTT 5580 

TCCAAAAGAA CGTCGTGCGC GTTATGTATC TGACCTTGGT TTATCAGACT ACGATGCTAG 5640

TCAGTTGACT GCTAATAAAG TCACTTCTGA CTTCTTTGAA AAAGCTGTTG CCCTAGGTGG 5700 

TGATGCCAAA CAAGTCTCTA ACTGGCTCCA AGGGGAAGTC GCTCAGTTCT TGAATGCTGA 5760 

AGGTAAAACA CTGGAACAAA TCGAATTGAC ACCAGAAAAC TTGGTTGAAA TGATTGCCAT 5820 
CATCGAAGAC GGTACTATTT CATCTAAGAT TGCCAAGAAA GTCTTTGTCC ATCTAGCTAA 5880

AAATGGCGGT GGCGCGCGTG AATACGTGGA AAAAGCAGGT ATGGTTCAAA TTTCAGATCC 5940 

AGCTATCTTG ATCCCAATCA TCCACCAAGT CTTTGCCGAT AACGAAGCTG CTGTTGCCGA 6000 

CTTCAAGTCA GGCAAACGTA ACGCCGACAA GGCtTTACAG GATTCCTTAT GAAGGCAACC 6060 

AAAGGCCAAG CCAACCCACA AGTTGCCCTT AAACTACTTG CACAGGAATT GGCGAAGTTG 6120 

AAAGAAAACT AGACAGAACA AAACCAGCCC TAAGGTTGGT TTTTTCTTCT CTACCAACTC 6180 

CCAATAACTA TTTTGGCTTT ATTTCCAGAG TATTTTATGG TAAAATGAAG AGTAATAATA 6240 

TTTATTAAAG AGGTAAAAAC ATGATTGAAG CAAGTACCTT AAAAGCTGGT ATGACCTTTG 6300 

AAACAGCTGA CGGCAAATTG ATTCGCGTTT TGGAAGCTAG TCACCACAAA CCAGGTAAAG 6360 

GAAACACGAT CATGCGTATG AAATTGCGTG ATGTCCGTAC TGGTTCTACA TTTGACACAA 6420 

GCTACCGTCC AGAGGAAAAA TTTGAACAAG CTATTATCGA GACTGTCCCA GCTCAATACT 6480 

TGTACAAAAT GGATGACACA GCATACTTCA TGAATACAGA AACTTATGAC CAATACGAAA 6540 

TCCCTGTAGT CAATGTTGAA AACGAATTGC TTTACATCCT TGAAAACTCT GATGTGAAAA 6600 

TCCAATTCTA CGGAACTGAA GTGATCGGTG TCACCGTTCC TACTACTGTT GAGTTGACAG 6660 

TTGCTGAAAC TCAACCATCT ATCAAAGGTG CTACTGTTAC AGGTTCTGGT AAACCAGCAA 6720 

CGATGGAAAC TGGACTTGTC GTAAACGTTC CAGACTTCAT CGAAGCAGGA CAAAAACTCG 6780 

TTATCAACAC TGCAGAAGGA ACTTACGTTT CTCGTGCCTA ATCTCTAGAA AGAGGTCATT 6840 

CTATGGGAAT TGAAGAACAA CTTGGCGAAA TCGTTATCGC CCCACGTGTA CTTGAAAAAA 6900 

TCATTGCTAT CGCTACTGCA AAGGTAGAGG GTGTTCACTC TTTTTCAAAC AGATCAGTGT 6960 

CTGATACCCT TTCAAAACTT TCACTCGGCC GTGGCATTTA TCTTAAAAAC GTGGACGAAG 7020 

AACTCACAGC AGATATCTAT CTCTACCTTG AGTACGGAGT AAAAGTTCCT AAGGTAGCGG 7080 

TTGCTATCCA GAAAGCTGTC AAAGATGCCG TCCGTAATAT GGCTGATGTA GAACTCGCTG 7140 

CTATCAATAT TCACGTTGCA GGTATCGTCC CAGATAAAAC ACCAAAACCA GAATTGAAAG 7200 

ATCTATTTGA CGAGGACTTC CTCAATGACT AGTCCACTAT TAGAATCTAG ACGCCAACTC 7260 

CGTAAATGCG CTTTTCAAGC TCTCATGAGC CTTGAGTTCG GTACGGATGT CGAAACTGCT 7320 

TGTCGTTTCG CCTATACTCA TGATCGTGAA GATACGGATG TACAACTTCC AGCCTTTTTG 7380 

ATAGACCTCG TTTCTGGTGT TCAAGCTAAA AAGGAAGAAC TAGATAAGCA AATCACTCAG 7440 

CATTTAAAAG CAGGTTGGAC CATTGAACGC TTAACGCTCG TGGAGAGAAA CCTCCTTCGC 7500 

TTGGGAGTCT TTGAAATCAC TTCATTTGAC ACTCCTCAGC TGGTTGCTGT TAATGAAGCT 7560 

ATCGAGCTTG CAAAGGACTT CTCCGATCAA AAATCTGCCC GTTTTATCAA TGGACTGCTC 7620 
AGCCAGTTTG TAACAGAAGA ACAATAAGGC TCTTTGTCAA CTGTAGTGGG TTGAAAAAAA 7680

GCTAAGCTCG AGAAAGGACA AATTTCGTCC TTTCTTTTTT GATGTTCAAA GCGATAAAAA 7740

TCCGTTTTTT GAAGTTTTCA AAGTTTCGAA AACCAAAGGC ATTGCGCTTG ATAAGTTTGA 7800

TGAGATTATT GGTCGCTTCC AGTTTGGCAT TAGAATAGTG TAGTTGAAGG GCGTTGACAA 7860

TCTTTTCTTT ATCTTTGAGG AAGGTTTTAA AGACAGTCTG AAAAATAGGA TGAGCCTGCT 7920

TAAGATTGTC CTCAATAAGT CCGAAAAATT TCTCTGGTTC CTTATTCTGG AAGTGAAACA 7980

GCAAGAGCTG ATAGAGCTGA TAGTGGTGTT TCAAGTCTTG TGAATGGCTC AAAAGCTTGT 8040

CTAAAATCTC TTTATTGGTT AAGTGCATAC GAAAAGTAGG ACGATAAAAT CGCTTATCAC 8100

TCAGTCTACG GCTATCCTGT TGAATGAGTT TCCAGTAGCG CTTGATATCC TTGTATTCAT 8160

GGGATTTTCG ATGAAACTGA TTCATGATTT GGACACGCAC ACGACTCATG GCACGGCTAA 8220

GATGTTGTAC AATGTGAAAG CGATCAAGAA CGATTTTAGC ATTCGGGAGT GAAACAGTCT 8280

GGGAGACTGT TTCAGCCTGA GCCTAGGAAT TTGAAAGCGA AGCTGTTTAG CCAAGTCATA 8340

GTAAGGGCTA AACATATCCA TAGTAATAAT TTTGACGCGA CATCGGACAA CTCTATCGTA 8400

GCGAAGAAAG TGATTTCGAA TGATAGCTTG TGTTCTACCC TCAAGAACAG TGATGATATT 8460

GAGATTGTTA AAATCTTGCG CAATGAAGCT CATCTTTCCC TTTGTAAAAG CATACTCATC 8520

CCAAGACATA ATCTCAGGAA GACAAGAAAA ATCATGTTTA AAGTGAAAAT CATTGAGCTT 8580

ACGAATAACA GTTGAAGTTG AGATGGAAAG CTGATGGGCA ATATCAGTCA TAGAAATCTT 8640

TTCAATCAAC TTTTGAGCAA TCTTTTGGTT GATGATACGA GGGATTTGGT GATTTTTCTT 8700

GACGATAGAA GTTTCAGCGA CCATCATTTT TGAACAGTGA TAGCACTTGA ATCGACGCTT 8760

TCTAAGGAGA ATTCTAGTAG GCATACCAGT CGTTTCAAGA TAAGGAATTT TAGAAGGTTT 8820

TTGAAAGTCA TATTTCTTCA ATTGGTTTCC GCACTCAGGG CAAGATGGGG CGTCGTAGTC 8880

CAGTTTGGCG ATGATTTCCT TGTGTGTATC CTTATTGATG ATGTCTAAAA TCTGGATATT 8940

AGGGTCTTTA ATGTCTAGTA ATTTTGTGAT AAAATGTAAT TGTTCCATAT GAATCTTTCT 9000

AATGAGTTGT TTTGTCGCTT TTCATTATAG GTCATATGGG ACTTTTTTTC TACAATAAAA 9060

TAGGCTCCAT AATATCTATA GGGGATTTAC CCACTACAAA TATTATAGAG CCAACAATAA 9120

AAAGAAAAAG TGTTTGATAG ATATCAAACA CTTTTTTCTT TGCCTCCCAC TATCTAAAAA 9180

AATGATAATA GATATAATTG TAAACAAAAA TCCAGATAGG TTTTGCATGA TTGAGAAAGT 9240

TAAAAAAACT ATGGCAGAGA ATCGTTAATC TCAGATTGTC GGTAGAACGA TAAACAAGGG 9300

CAAAAAAGAA ACCAATCAGA CTATAATATA ATAAACTAAT TGGATCTCTG TGAGATAGTA 9360
TCAAATGGCT AATCCCAAAG ATGATAGCAG ATAGGATAAC ATCCAAATAG TACTTGGACT 9420

AGGGAAAGAA GGTATTCATA AAATACCCTC TATCAAGAGT CTCCTCAAAA ACAGGACCGA 9480

TGATTACAGG CAGGACAAAA GATAAGATAG TCGATAAAAA GGTTGGTTGT CCATTTGAAA 9540

AAAGCACGGT AAAATACTCA TCATGAATAT TCCTATGATT AATCAAATGA GCATAGCGTG 9600

CCCAAAAATT ACCGAGAATC TGATAAACCA CATAAGTTGC AAATAAGTAG AAGACAAATG 9660

ACCAGTTCCA GCTCTTTTTC TCAAAGATAA AGAGCATCTT TTTCTTTTTT AACCTCCAAA 9720

TTAATAGAAG GAAACTPCCC ACTAATCCCA TTGTTAAAAT AAGAGAATAG ACATCAGCTC 9780

CTAACCCTAA AATGATCGTC ACATACAATC CAATTGTTTG TGGTAAATAG GTAGATAGTA 9840

AAATAATAAG CAAAAATATT CCAAATTGTC TTAGTTTTTT TGTGTTTCTC ATCGTACTTT 9900

TTTGAAAGAT TACCCTGCTC GGAAGCCGTA CTTCCAAGCA TCTATATAAG AATTAAGTGC 9960

CCCTTGCCTC ATATAGGGAG CAAATTCTCT ATAATATAAC CATCTACTAT ATCCATCTTC 10020

CCAAACAGCA AGACCACCTG AAGTTTGCTC CAAGTCCTCA GTTGAAAGAA CTGTAAATGT 10080

ATTTGTACCT GTCATTGCAA GTACCTTCTT AAAATAGATT GTTGTAGGCT CACATTTATA 10140

GTATATTTCT TTTTTTGTCT ATTTTATAGC CCATCTCCTC AACTGGCAAT TTTTCGACCT 10200

GAATTACATT TTTCCATAAA AAATGAGACC TTTCTAGTCT CATTTAGTCA TTCTTAGTAT 10260

TTTCTAAATC GTTGATAGCG TTCTTCCAGC AACTCTTCTA GCGGTTTTTG TGAAAGTCTA 10320

GCCAGCTCCG TTTGGAGTTC TTTTTTGACA CTCTTAATCA GTTCTTTACT AGAAAGTCCT 10380

ATTTCAGAAA TCACCTTATC CACCACGTCC ATTTCTAACA GTTCATGCGA AGTGATTTTC 10440

ATCAGTTCTG CTGCTTCCAT AGCGCGAGTA CCGTCCTTCC ATAAAATGGA AGCAAAGCCT 10500

TCTGGACTGA GAATGGCATA GATAGAATTT TCCAGCATCC AGACACGGTC CGCGACAGCT 10560

AGAGCCAGAG CCCCGCCTGA ACCACCTTCA CCGATAATAA TGGCGATAAT AGGAACTTTC 10620

AGGTCACTCA TTTCCATGAG ATTGCGAGCG ATAGCTTCCC CTTGACCACG TTCTTCCGCT 10680

CCGACACCAG GATAAGCACC TGCTGTATTG ATAAAGGTCA CAACTGGACG GCCAAATTTC 10740

TCAGCCTGTT TCATCAACCG CAGTGCCTTT CGGTAGCCTT CTGGATGTGG TTGGCCAAAA 10800

TTCCGTTTGA GGTTGTCTTG CAAACTCTTG CCTTTTTGGA TACCAACCAC TGTTACAGCT 10860

TGGTCTCCAA GCCAACCAAT ACCACCAACA ACTGCACCAT CATCACGAAA AGAACGGTCA 10920

CCATGTAATT GGATAAATTC ATCAAAAATG CCTGTCGCAA AGTCCAAGGT TGTCAAGCGA 10980

CTCTGCTCAC GCGCTTCTCT GACTATTTTT GCAATATTCA TCTAGGACTC CCTCCATGCA 11040

ATCTGACTAG GCTAGCAATC GTATCTGGTA AGTCTCTTCT TTTGACAATA GCATCCACAA 11100

AGCCATGTTC TAATAGGAAT TCTGCCTTTT GGAAATCCTC AGGCAAGCTT TCACGAACCG 11160
TATTTTCAAT CACACGACGC CCAGCAAAAC CAACCAAGCT CTGTGGTTCA GCCAGAATGA 11220

TATCGCCTTC CATAGCGAAA GAAGCTGTCA CACCACCAGT CGTTGGATCT GTCAAAATGG 11280 

TCAGGTAAAA GAGACCAGCA TTTGAATGGC GTTTAACCGC CGCAGAGATC TTAGCCATCT 11340 

GCATGAGACT CATGATTCCT TCCTGCATAC GGGCTCCACC AGAGGCTGTG AATAGGACAA 11400 

CTGGCAATTT TTCGACAGTC GCATACTCAA ACAAACGAGT GATTTTTTCA CCTACAACCG 11460 

TACCCATAGA AGCCATGATA AAGTTAGAAT CCATAATCCC AAGAGCCACA GTCTGACCTT 11520 

TAATAAGAGC AGTTCCTGTC ACAACGGCTT CATGCAGACC TGTTTTTTCA CGCATAGATG 11580 

CCAGTTTCTT TTGGTAACCA GGGAAATGCA AGGGATCCTT GCTTTCAATC CCTGTAAACA 11640 

ATTCTTTGAA GGTTCCCATA TCAATCGTCA AAGCCAAGCG TTCTTGGGCA GAAATACGAA 11700 

AGGTATAGCT ACAGTGCGGA CAGATACGTT CACTTCCCAG ATCCTTCTGA TAGATGGTAT 11760 

GCTTACAGCC TGGACACTGG GAAAATAATT CATCTGGAAC CTCTGGCTTA GCTTGAGGTT 11820 

TTTCCCTAAC CGAACGATTG GGATTGATTC GAATATACTT ATCTTTTTTA CTAAATAGAG 11880 

CCATTGATTC CCCTTTTCGG TTTAAACTCT TAAAGTCATT TTATTCTTTT TCTTGATATT 11940 

TAGGTAAGAA GGTTTCCATC AAGAAGGAAG TATCATAATC CCCAGCAATG ACATTGCGAT 12000 

CTGAAATGAG GTCAAGCTGG AAATCTGCAT TGGTCTGCAC TCCTTCAATT TCTAATTCAT 12060 

AGAGGGCACG TTGCATTTTC ATCAAGGCGT CAAAACGATT TTCGCCGTGT ACTATGATTT 12120 

TGGCAATCAT ACTATCATAA TAAGGCGGAA TGGTATAACC TGGATAAACT GCTGAATCCA 12180 

CGCGCAAGCC AACTCCACCA CTTGGCAGAT AGAGATTAGT AATCTTACCT GGACTTGGAG 12240 

CAAAGTTAAA GGCTGGGTTT TCTGCATTGA TACGACACTC GATGGCATGA CCGCGTAGGA 12300 

CAATATCTTC TTGCTTAACA GACAAAGGCT GACCTGCCGC AATGCAAATC TGTTCCTTAA 12360 

CGATATCAAC ACCTGAAACA AACTCTGTTA CTGGATGTTC TACCTGAACA CGAGTATTCA 12420 

TCTCCATGAA ATAGAAATTG CTACTTGCTT CATCAAGAAG AAATTCAATG GTTCCTGCAT 12480 

TCTCATAGCC AACAAACTCT GCCGCTCGAA CAGCAGCAGC ACCTATTTCA TGACGCAGCG 12540 

TTTTTCCGAT TGCAATCGAG GGACTTTCTT CCAAAACCTT TTGGTTATTC CTTTGAAGAG 12600 

AACAATCCCG TTCACCCAAG TGAATCACAT GTCCATGCTC ATCACCTAGG ATTTGAACCT 12660 

CAATGTGCCG AGCTGGATAG ATAACCCGTT CTATGTACAT GGCACCATTG CCATAATTGG 12720 

CCTTGGCCTC ACTAGAGGCA GTTTCAAAGG CAGAAACGAG GTCATCTGGT TTTTCAACCT 12780 

TACGAATCCC TTTACCACCT CCACCTGCTG AAGCCTTGAG CATAACAGGA TAGCCAATTT 12840 

TTTCAGCAAC AATCAAAGCT TCTTCAGAGT TATGCACTTC TCCATCTGAA CCTGGTATAA 12900 

CAGGCACACC TGCTTTAATC ATCTGAGCAC GCGCATTGAT CTTATCCCCC ATCATATCCA 12960 

TAACATGACC AGATGGACCG ATAAACTTGA TACCTACTTC TTCACACATG GTCGCAAATT 13020 

TGGAATTTTC ACTGAGAAAT CCAAAACCAG GGTGAATAGC TTCTGCCTCA GTCAAGACTG 13080 

CAGCTGATAG AACTGCATTA ATATTGAGAT AAGACTCTGT TGCCTTGCCA GGACCAATAC 13140 

AAACTGCTTC ATCTGCCAAA AGCGTATGAA GAGCTTCCTT ATCAGCAGTT GAATAAACCG 13200 

CTACCGTCGC AATCCCCAAT TCACGTGCCG CACGGATAAT ACGAACCGCA ATTTCACCAC 13260 

GATTGGCAAT TAAAATTTTT CGAAACATGG AGAACCTCCT TAGTTCCCAA TTGCAAAAGT 13320 

AAGGGTACCA CTGGCTGCAA GCTTGCCATC CACTTCAGCC TTTGCTTCAA CCACAGCTAT 13380 

GGTGCCACGA CGTTTTACAA AAGTCGCTGT CATAACCAAT TGGTCGCCTG GTACAACTTG 13440 

CTTCTTGAAC TTAACCTTGT CCATACCAGC GTAAAAGACC AGTTTTCCTT TATTTTCAGG 13500 

TTTTGATAAC TCCAACACAC CGGCAGTTTG CGCCAAGGCT TCCATAATCA CAACACCTGG 13560 

CATAACTGGG TATTGAGGAA AGTGGCCGTT AAAGAAAGGC TCGTTGATGG TCACATTTTT 13620 

GATAGCAACA ATGGTATCCT CGCTCACTTC CAAGACACGG TCCACTAGAA GCATAGGATA 13680 

ACGGTGGGGA AGAGCTTCTT TGATTCCTTG AATATCGATC ATTTGATACG TACCAATCCT 13740 

TTACCAAACT CAACCATTTC TTCGTTAGAG ACGAGAATTT CCGTTACCAC ACCATCCTTA 13800 

GGAGCTGGGA TTTCATTCAT GACTTTCATG GCTTCGATAA TTACCAATGT TTGACCTTTT 13860 

TTGACACTAT CACCAACTGT AACGAAGGCA GGTTTATCTG GTCCAGCAGC CAAGTAAACC 13920 

ACTCCAACAA GTGGACTCTC TACAAGATTT CCCTCAGTAG CCACACTTGC TTCAGCTGGA 13980 

GCTGGAACTT CTTCTGCTAC AGTCTCTGCT GGAGCAGATG TAGGAGCTAC TGGACTCGGT 14040 

GTTGCTAGAA CGGGTGCTGG AGCGACTTGA GTTGCAACTT CAGGCACAGG TCTTGCTTCA 14100 

TTCTTGCTAA ACTGCAACTC ATCCGTCCCA TTTTTATAAG AAAATTCTCT CAAACTTGAC 14160 

TGGTCAAATT GAGTCATCAA GTCTTTAATA TCGTTTAAAT TCATACTTAT CTATTCTCCC 14220 

AACGTTTGAA AGCAAGAACT GCATTGTGGC CTCCAAAACC AAAAGTATTT GAAATAGCGT 14280 

ATGGAATTTC TTTCTCCAAG CCTTGTCCAT AAACGACATT AGCTTCGATA TAATCTGATA 14340 

CTTCACTTGT CCCAGCTGTC ATTGGTACAA AGTTATGACG CATAGCTTCG ATGGTGACGA 14400 

TAGCTTCTAC TGCACCCGCA GCCCCCAGCA AATGTCCTGT AAAAGACTTG GTTGATGATA 14460 

CAGGTACTTC CTTACCAAGA ACAGCTACGA TAGCACCACT TTCTCCTTTT TCATTGGCAG 14520 

GAGTTGACGT TCCGTGAGCA TTGACATAGG CTACTTGCTC TGGAGAAATC TCAGCTTCTT 14580 

CCAAGGCTAG TTTGATGGCC TTGATAGCTC CCTGACCTTC TGGATGTGGA GAAGTCATGT 14640 

GGTAGGCATC ACAAGTATTT CCGTAACCAA CCACTTCAGC CAGGATAGTA GCTCCACGTT 14700 

TTTCAGCGTG TTCAAGACTT TCTAGAACCA ACATCCCTGA ACCTTCACCC ATAACAAACC 14760 
CATTGCGATC CTTATCAAAT GGGATCGAAG CACGAGTTGG ATCCTCTGTA GTAGAGAGAG 14820 
CTGTTAAGGC TTGGAAACCA GCGATGGCAA AAGGTGTGAT AGAAGCTTCT GTTCCTCCCA 14880 
CCAACATCAC ATCTTGGAAA CCAAACTTAA TGGAGCGGAA GGCATCCCCA ATCGCATCAT 14940 
TTGATGAAGA GCAGGCAGTA TTGATAGATT TACAAACACC GTTTGCACCA AAACGCATGG 15000 
CTACATTCCC AGAAGCCATA TTTGGTAAAG CTTTTGGAAG AGTCATTGGT TTGACACGTT 15060 
TGGGTCCTTT TTCATGAAGG CGAAGTACCT GATCTTCAAT TTCCTTGATT CCACCAATAC 15120 
CAGATGCAAC GATAACACCA AAACGATCCC TATTAAGAGC CTCTACATCA AGATTGGCAT 15180 
GATTTACAGC CTCTTGGGCT GCATACAAGG CATATAAAGA ATAGTTATCA AAACGGTTGG 15240 
TATCTTTTTT TACAAAGTAT TTATCGAACG GAAAATCTTG GATTTCTGCC GCATTATGCA 15300 
CATCAAAGTC ACTATGATCA AATTTTGTAA TGCCACCAAT GCCGATTTTC CCAGTTGCTA 15360 
AACTATTCCA AAATTCTTCT GGTGTATTTC CGATTGGAGA TGTTACTCCA TAACCTGTTA 15420 
CCACTACTCG ATTTAGTTTC ATTCTTTTCA CCTCTAGCTT TCGCTACATA CTTAAGCCAC 15480 
CATCAATGGC AACCACTTGT CCAGTTAGAT AATCTTGGCC TGCTAAAAAT ACTGTCAAAT 15540 
CTGCAACCTG CTCTGCCTGC CCAAATTCTT TCATCGGAAT CTGAGCTAGT GTAGCTTCCT 15600 
TAATCTTATC TGACAGGATA GCGGTCATAT CAGACTCAAT CATTCCTGGA GCAATCACAT 15660 
TGACTCGTAT ATTCCGACTA GCGACCTCGC GTGCCACAGA CTTGGTAAAG CCAATCAAGC 15720 
CAGCCTTAGA AGCAGCATAA TTAGCTTGAC CAATATTCCC CATCAAACCA ACAACACTAG 15780 
ACATATTAAT GATAGCACCT TCTCTGGCTT TCATCATCGG TTTCAAGACT GATTGTGTCA 15840 
TATTAAAGGC ACCAGTCAGA TTGACCTTGA GCACTTTTTC AAAATCTGCT TCTGTCATCT 15900 
TGAGCATAAG AGTATCTTGG GTAATCCCTG CATTGTTGAC CAAAACATCT ACTGAACCCA 15960 
GTTCTGCAAT AGCTTGATCA ATCATACGCT TAGCGTCTGC AAAATCTGAT ACATCTCCTG 16020 
AAATGGGAAC CACCTTGATA CCATAGTTTG AAAACTCAGC GAGCAATTCT TCTGAGATTG 16080 
CCCCACGACT GTTTAAGACA ATGTTGGCTC CTGCTTGAGC AAACTTGTGG GCGATGGCAA 16140 
GACCAATTCC ACGACTCGAA CCTGTAATAA AGATATTTTT ATGTTCTAGT TTCATTTTTT 16200 
TCCTTTCAAA ACTTCTACTT ATTTTAGTCT ATTTTTCTAA AAGTGCTACT AAACTCGCTT 16260 
GATCTTCCAC ATGAGCTAAG TGAGCAGTTT GATCAATTTT TTTAACAAAA CCTGACAAGA 16320 
CTTTCCCCGG TCCAATCTCG ATAAAGTTGC TTATGCCTGC TTCTTGCATG ACCCCAATAC 16380 
TTTCATAGAA ACGAACGGGT TCCTTGACCT GACGCGTCAA GAGCTGAGCA ATGTCCTCTT 16440 
TTTGCATCAC AGCAGCTTCT GTATTGCCGA CTAGGGGACA AGTAAAATCT GAAAAACTTA 16500 
CCTGAGCTAG AGTTTCAGCT AGTTTCTGGC TAGCAGGTTC AAGGAGAGCG GTGTGAAAGG 16560 
GACCTGACAC CTTAAGAGGA ATCAAGCGTT TGGCACCTGC TTCTTGCAAA AGTTCAACCG 16620 
CTCGATCAAC TGCAACCACT TCTCCAGCAA TGACGATTTG TGCAGGTGTG TTATAGTTGG 16680 
CTGGAGTAAC CACTCCAAGT TCAGAAGCTT TTTGACAGGC TTCTTCAATG ACCTCTACTG 16740 
GCGTATTGAG AACTGCTACC ATCTTGCCAG AGTCAGCAGG AGCCGCTTCT TCCATATAGG 16800 
CTCCACGCTT AGCTACCAAG GCAACCGCAT CTTCAAAATC CAAGGCGCCA CTTGCCACCA 16860 
AGGCAGAGTA TTCTCCAAGA GACAAACCAG CAACCATATC AGGCTGATAG CCCTTTTCTT 16920 
GCAATAAACG GTAGATAGCA ACCGAAGTCG CTAGAATGGC TGGTTGCGTA TAGCGGGTCT 16980 
GATTGAGTTT GTCTTCTTCC GTATCGATGA GATAACGCAA ATCATAACCG AGCACCTGGC 17040 
TCGCTCGATC AATCGTTTCT TTAACAATCG GATACTGATC ATAGAAATCC CGTCCCATCC 17100 
CTAGATACTG GGCACCTTGA CCAGCAAATA AAAAGGCTGT TTTAGTCATT TCTTACAACT 17160 
CCTGTCCAGC GAGAGGCTTC TTCTTGAATT TTCTTAGCGG CTCCGTAATA CAAATCTTTT 17220 
AGGATTTCTT CAGCTGTTTC TTCTTTAGAA ACAAGCCCTG CGATTTGACC TGCCATAACA 17280 
GAGCCACCAT CCACATCACC GTGAACAACT GCTTTGGCTA GAGCACCTGC TCCCATTTGT 17340 
TCAAAGATTT CTAAATCAGG ATCTTCTTGC TTAAAGGCAT CTTTTTCAGC CAGTTCAAAA 17400 
TCTCTAGTCA ACTGATTTTT AATAGCACGA ACAGCATGAC CAAAGTGCTG AGCTGAAATC 17460 
GTAGTATCAA TATCCCTTGC TTTTAAAATT TTCTCCTTGT AGTTTGGATG GGCATTCGAC 17520 
TCTTTTGCAA CTACAAACCG TGTCCCCACC TGTACAGCCT CTGCACCTAG CATAAAGCCA 17580 
GCCGCAGCAC CTTCACCATC CGCAATTCCT CCTGCAGCAA TAACAGGAAT AGATATAGCT 17640 
GTGGCTACCT GTCGCACCAA GGTCATGGTT GTTAATTTAC CGATATGCCC CCCAGCTTCC 17700 
ATTCCTTCTG CAATAACAGC GTCTGCACCG ATTTTTTCCA TGCGTTTAGC TAAAGCGACA 17760 
CTAGGAACAA CAGGAATAAC GATTATCCCA GCTTCATGGA AACGTTCCAT ATACTTGCTT 17820 
GGATTTCCTG CTCCTGTTGT GACAACTTTA ACACCTTCTT CAATAACGAG ATCCACX3ATG 17880 
TCTTCCACAA AGGGAGATAA GAGCATGATG TTGACCCCAA AGGGTTTATC AGTCAATGAT 17940 
TTGATTTTAT CAATATTGGC CTTGACAACT TCTTTCGGGG CATTTCCCCC ACCGATAATT 18000 
CCTAATCCTC CAGCCTTGGA AACAGCCCCT GCCAAATCAC CATCAGCAAC CCAGGCCATC 18060 
CCTCCTTGGA AAATAGGATA ATCAATCTTC AATAATTCTG TAATACGCGT TTTCATAGTG 18120 
CCTCCAACCT TCCTTGCTTA CGTAATAGTT CGATTTCACC ATAATTTGAC AGTCAAACTA 18180 
TTACCTAAAC AAGAGGGAGT GGGTTTCTCC CTACTCCTTC TACTAATATT CTGCTTATTT 18240 
TGCTTGCTCT TCAACGTAAG CAACCAAGTC ACCAACTGTT TTCAAGTCAT TTTCTGCTTC 18300 
GATTTGGATA TCAAAAGCAT CTTCGATTTC TGAGATTACT TGGAACAAGT CCAATGAATC 18360 
TGCGTCCAAA TCATCAAAAG TTGATTCAAG TGTTACTTCT GATGCGTCTT TTCCAAGTTC 18420 
TTCAACGATA ATTTCTTGTA CTTTTTCAAA TACTGCCATG ATAGGACTCC TTTAAAATAA 18480 
ATAGTTTTTT TATAACAATG TGTTCACCAC ATGATTACCT AAATTGTAAG AATGAGCGTG 18540 
CCCCAGGTCA AGCCTCCACC GAAGCCTGAT AGAAGAACAG TCTGGCTACC ATCTAAAGGG 18600 
ATGAGACCTT GTTCTACACA CTCTGAAAGT AAAATCGGGA TACTGGCTGC ACTGGTATTG 18660 
CCATATTCCA TCATATTGGC TGGAAGTTTG GCTCGGTCAA CACCAATTTT TCTAGCCATC 18720 
TTATCCAAAA TACGGTCATT GGCTTGATGA AGTAGCAGAT AATCCAAGTC TGTCACCTCT 18780 
ATAGGAGATT CATCAATAGT CTGCTTGATA GACTTGGCTA CATCTCGAAT GGCAAAATCA 18840 
AAGACTGTGC GTCCATCCAT CTTCAAAAAC GAATCTGCAC TTTCTTGATC TGAAAATGGA 18900 
GAATGTAAAC CTGAATGCCC ATAAGTTAAA CACTCGCTGC GACTTCCATC GCTATTGAGA 18960 
CTCTCAGCTA AGAAATGCTC TTGCTCGCTA GCTTCTAACA AGACACCACC AGCACCATCT 19020 
CCAAACAACA CAGCTGTTGA TCGATCCGAC CAATCGACTG CCTTAGAGAG GGTTTCACTA 19080 
CCAATCACCA AGCCTTTTTG AAAGCGACCA GAAGCGATAA ACTTTTCAGC AGTTGAAAGA 19140 
GCAAATACAA ATCCACTGCA AGCCGCGGTT AAGTCAAAAG CAAAGGCTTT ATTAGCACCA 19200 
ATATTAGCTT GAACACGAGC AGCTGTAGAG GGCATCATCG AATCTGGAGT AATGGTAGCT 19260 
AGGATGATAA AATCCAGTTC TTCTCCTGTT ATTCCAGCTT TTGCCATCAG TTTCTTAGCA 19320 
ACCTCTGTAG CCAAATCACT GGTAGATTCT GTTCTTGAAA TATGCCTTTG TCGTATTCCC 19380 
GTTCGACTTG AAATCCACTC ATCATTGGTA TCCATAATCT GAGCCAAGTC GTGATTTGTA 19440 
ACCACTTGCT CTGGCACATA ATGAGCAACC TGACTTATTT TTGCAAAAGC CATTATTTCA 19500 
AATCCTCCAA AAATTGGTAA AGATTAGTCA AACCTTTACC CATGACAGCA ATTTCTTCCT 19560 
CGCTCATGCC ATCAATAATT TTTTCTACCA TGGCCTTGTG GAAGCGTTTA TGCAGTCTAT 19620 
GAATCAAGCG ACCCTTCTTT GTCAAATGCA GATGCACCAC ACGACGATCC TGTTCTGACC 19680 
GAACTCGCTC AATGTAGCCC GG 19702 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6211 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAAAATTTCC TCTCTTCTCT TGAAAAATTT TGAAAAAATG GTATGATAGT AACAAGTTAT 60 
TTTTAAGAGG AAAGAAAGGG GAATAATGGA GAAAATCAGT TTAGAATCTC CTAAGACGGG 120 
GTCGGACCTA GTTTTGGAAA CACTTCGTGA TTTAGGAGTT GATACCATCT TTGGTTATCC 180 
TGGTGGTGCG GTTTTGCCTT TTTATGATGC GATATATAAT TTTAAAGGCA TTCGCCACAT 240 
TCTAGGGCGC CATGAGCAAG GTTGTTTGCA TGAAGCTGAA GGTTATGCCA AATCAACTGG 300 
AAAGTTGGGT GTTGCCGTCG TCACTAGTGG ACCAGGAGCA ACAAATGCCA TTACAGGGAT 360 
TGCGGATGCC ATGAGCGATA GCGTTCCCCT TTTGGTCTTT ACAGGTCAGG TGGCGCGAGC 420 
AGGGATTGGG AAGGATGCCT TTCAGGAGGC AGACATCGTG GGAATTACCA TGCCAATCAC 480 
TAAGTACAAT TACCAAGTTC GTGAGACAGC TGATATTCCG CGTATCATTA CGGAAGCTGT 540 
CCATATCGCA ACTACAGGCC GTCCAGGGCC AGTTGTAATT GACCTACCAA AAGACATATC 600 
TGCTTTAGAA ACAGACTTCA TTTATTCACC AGAAGTGAAT TTACCAAGTT ATCAGCCGAC 660 
TCTTGAGCCG AATGATATGC AAATCAAGAA AATCTTGAAG CAATTGTCCA AGGCTAAAAA 720 
GCCAGTCTTG TTAGCTGGTG GTGGAATTAG TTATGCTGAG GCTGCTACGG AACTAAATGA 780 
ATTTGCAGAA CGCTATCAAA TTCCAGTGGT AACCAGTCTT TTGGGACAAG GAACGATTGC 840 
AACGAGTCAC CCACTCTTTC TTGGAATGGG AGGCATGCAC GGGTCATTCG CAGCAAATAT 900 
TGCTATGACG GAAGCGGACT TTATGATTAG TATTGGTTCT CGTTTCGATG ACCGTTTGAC 960 
GGGGAATCCT AAGACTTTCG CTAAGAATGC TAAGGTTGCC CACATTGATA TTGACCCAGC 1020 
TGAGATTGGC AAGATTATCA GTGCAGACAT TCCTGTAGTT GGAGATGCTA AGAAGGCCTT 1080 
GCAAATGTTG CTAGCAGAAC CAACAGTTCA CAACAACACT GAAAAGTGGA TTGAGAAAGT 1140 
CACTAAAGAC AAGAATCGTG TTCGTTCTTA TGATAAGAAA GAGCGTGTGG TTCAACCGCA 1200 
AGCAGTTATT GAACGAATTG GTGAATTGAC GAATGGAGAT GCCATTGTGG TAACAGACGT 1260 
TGGTCAACAC CAAATGTGGA CAGCTCAGTA TTATCCCTAC CAAAATGAAC GTCAGTTAGT 1320 
GACTTCAGGT GGTTTGGGAA CAATGGGCTT TGGAATTCCA GCAGCAATCG GTGCTAAAAT 1380 
TGCTAACCCA GATAAGGAAG TAGTCTTGTT TGTTGGGGAT GGTGGTTTCC AAATGACCAA 1440 
CCAGGAGTTG GCTATTTTGA ATATTTACAA GGTGCCAATC AAGGTGGTTA TGCTGAACAA 1500 
TCATTCACTT GGAATGGTTC GCCAGTGGCA GGAATCCTTC TATGAAGGCA GAACATCAGA 1560 
GTCGGTCTTT GATACCCTTC CTGATTTCCA ATTGATGGCG CAGGCTTATG GTATTAAAAA 1620 
CTATAAGTTT GACAATCCTG AGACCTTGGC TCAAGACCTT GAAGTCATCA CTGAGGATGT 1680 

TCCTATGCTA ATTGAGGTAG ATATTTCTCG TAAGGAACAG GTGTTACCAA TGGTACCGGC 1740 

TGGTAAGAGT AATCATGAGA TGTTGGGGGT GCAGTTCCAT GCGTAGAATG TTAACAGCAA 1800 

AACTACAAAA TCGTTCAGGA GTCCTCAATC GCTTTACAGG TGTCCTATCT CGTCGTCAGG 1860 

TTAATATTGA AAGCATCTCT GTTGGAGCAA CAGAAGATCC GAATGTATOG CGTATCACTA 1920 

TTATTATTGA TGTTGCTTCT CATGATGAAG TGGAGCAAAT CATCAAACAG CTCAATCGTC 1980 

AGATTGATGT GATTCGCATT CGAGATATTA CAGACAAGCC TCATTTGGAG CGCGAGGTGA 2040 

TTTTGGTTAA GATGTCAGCG CCAGCTGAGA AGAGAGCTGA GATTTTAGCG ATTATTCAAC 2100 

CTTTCCGTGC AACAGTAGTA GACGTAGCGC CAAGCTCGAT TACCATTCAG ATGACGGGAA 2160 

ATGCAGAAAA GAGCGAAGCC CTATTGCGAG TCATTCGCCC ATACGGTATT CGCAATATTG 2220 

CTCGAACGGG TGCAACTGGA TTTACCCGCG ATTAAAAATC CAACTTAAAT TTATTAAACC 2280 

AGCCTAAAAG GCAATAAATA ATAGAAAAGA GAGAAAAGCT ATGACAGTTC AAATGGAATA 2340 

TGAAAAAGAT GTTAAAGTAG CAGCACTTGA CGGTAAAAAA ATCGCCGTTA TCGGTTATGG 2400 

TTCACAAGGG CATGCGCATG CTCAAAACTT GCGTGATTCA GGTCGTGACG TTATTATCGG 2460 

TGTACGTCCA GGTAAATCTT TTGATAAAGC AAAAGAAGAT GGATTTGATA CTTACACAGT 2520 

AGCAGAAGCT ACTAAGTTGG CTGATGTTAT CATGATCTTG GCGCCAGACG AAATTCAACA 2580 

AGAATTGTAC GAAGCAGAAA TCGCTCCAAA CTTGGAAGCT GGAAACGCAG TTGGATTTGC 2640 

CCATGGTTTC AACATCCACT TTGAATTTAT CAAAGTTCCT GCGGATGTAG ATGTCTTCAT 2700 

GTGTGCTCCT AAAGGACCAG GACACTTGGT ACGTCGTACT TACGAAGAAG GATTTGGTGT 2760 

TCCAGCTCTT TATGCAGTAT ACCAAGATGC AACAGGAAAT GCTAAAAACA TTGCTATGGA 2820 

CTGGTGTAAA GGTGTTGGAG CGGCTCGTGT AGGTCTTCTT GAAACAACTT ACAAAGAAGA 2880 

AACTGAAGAA GATTTGTTTG GTGAACAAGC TGTACTTTGT GGTGGTTTGA CTGCCCTTAT 2940 

CGAAGCAGGT TTCGAAGTCT TGACAGAAGC AGGTTACGCT CCAGAATTGG CTTACTTTGA 3000 

AGTTCTTCAC GAAATGAAAT TGATCGTTGA CTTGATCTAC GAAGGTGGAT TCAAGAAAAT 3060 

GCGTCAATCT ATTTCAAACA CTGCTGAATA CGGTGACTAT GTATCAGGTC CACGTGTAAT 3120 

CACTGAACAA GTTAAAGAAA ATATGAAGGC TGTCTTGGCA GACATCCAAA ATGGTAAATT 3180 

TGCAAATGAC TTTGTAAATG ACTATAAAGC TGGACGTCCA AAATTGACTG CTTACCGTGA 3240 

ACAAGCAGCT AACCTTGAAA TTGAAAAAGT TGGTGCAGAA TTGCGTAAAG CAATGCCATT 3300 

CGTTGGTAAA AACGACGATG ATGCATTCAA AATCTATAAC TAATTAGAAA TATATAGCGC 3360 

TGGAGATGAT TTTATGAAAA AGATTATGAG AAAAATTGCA TCGTTATTAT TGGTTCTAGT 3420 
TGTATAATGT AATTACACCG TCGGTAATAG TGCTAGCAGA CCAAAATAAA GCAGATTGGT 3480 
CGTATGATGA AAATGCTGTA ATTAACATTT ATGATGATGC TAATTTTGAA GATGGTAGGT 3540 
TGCATATGAA CTTTGAACAA TTCTTCAAAT TGGCACAAAT AGCTAGAGAA GAAGGTCTTG 3600 
AAATTCATTC TCCGTTTGAG AGAGCTGGTG CGACTAAATC TGCTCGTTAT ATAGCGAAAT 3660 
GGATTTTGAG AAATAAAAAA CATTAACAAA TATAGTTGGT AAATCATTAG GACCTAAATC 3720 
AGCTGTTAGA TTCGGAGAAG CTTTATCCTA TATTGAAGGT CCTCTTCGCA GAATAAATGA 3780 
GACGATAGAT GGCGGTTTAT ATCAAATAGA GCAAATTATT GCATCTGGAT TGAAAGAATC 3840 
GGGTTTAAAT GACTGGACTG CGAAAACTTT AGCTTCAGCT ATTCGTGGGA TATTAGATGT 3900 
ACTTATTTAG GGGTTGAAAT CATATGAATA TTACCAATTT GTTTTCTATC AAGACAGGAT 3960 
GTGATGAAAC TGATAGGCAA CTGCAAAAAC TATTTTTTCA GTTGGATTTA CAATTGGGAG 4020 
AATTGACAGA TCAACTAAGA AAATTAGATT CTAATTTTGT TCCTCGTAGT CAATTTGTAG 4080 
ACACGTTGGA TTTGAATGAT GTAGAATATA AAGAAATTTT AAACTATTTT ATCTTCCATC 4140 
GTAATGATAG TGAAGAAAGT TTGGTAGAAT GGTTATATGA TTGGATTTCC ACAAATCGTT 4200 
ATGAACTTCC TAAAGAGTTT TCGATTCGTA TGGCTCATAA ATACCATGAA AGTGTTACTG 4260 
AAGTTTTCGG AGATGAATAA CTAAAAAACA GTCATTAGTG ACTGTTTTTT ATAGAAAAAG 4320 
AGGTTTTATA TGTTAAGTTC AAAAGATATA ATCAAGGCTC ACAAGGTCTT GAACGGTGTG 4380 
GTTGTGAATA CTCCACTGGA TTACGATCAT TATTTATCGG AGAAGTATGG TGCTAAGATT 4440 
TATTTGAAAA AAGAAAATGC CCAGCGTGTT CGCTCCTTTA AAATTCGTGG TGCCTATTAT 4500 
GCCATTTCCC AGCTCAGCAA GGAAGAACGT GAACGTGGGG TAGTCTGCGC TTCTGCGGGA 4560 
AATCATGCGC AGGGAGTAGC CTATACTTGT AATGAAATGA AAATTCCTGC TACTATCTTT 4620 
ATGCCCATTA CTACGCCACA ACAAAAGATT GGTCAGGTTC GCTTTTTTGG TGGGGATTTT 4680 
GTAACTATTA AACTAGTTGG AGATACCTTT GATGCCTCAG CCAAAGCAGC TCAAGAATTT 4740 
ACAGTCTCTG AAAATCGTAC CTTTATTGAT CCTTTTGATG ATGCTCATGT TCAAGCAGGT 4800 
CAAGGAACAG TTGCTTATGA GATTTTAGAA GAAGCTCGAA AAGAATCGAT TGATTTTGAT 4860 
GCTGTCTTGG TTCCTGTTGG TGGTGGCGGT CTCATTGCCG GGGTTTCTAC CTATATCAAG 4920 
GAAACAAGTC CAGAGATTGA GGTTATCGGA GTAGAGGCGA ATGGAGCGCG TTCCATGAAA 4980 
GCTGCCTTTG AGGCTGGAGG TCCAGTAAAA CTCAAGGAAA TTGATAAATT TGCTGATGGG 5040 
ATTGCTGTGC AAAAGGTAGG TCAGTTGACC TATGAAGCAA CTCGTCAACA TATTAAAACT 5100 
TTGGTAGGTG TCGATGAGGG ATTGATTTCT GAAACCTTGA TTGACCTTTA CTCTAAGCAA 5160 
GGGATAGTCG CAGAACCTGC TGGAGCGGCT AGTATCGCCT CTTTAGAGGT TTTAGCTGAA 5220 
TATATTAAGG GGAAAACCAT TTGTTGTATC ATTTCTGGAG GAAATAATCA TATCAACCGT 5280 
ATGCCAGAAA TGGAAGAGCG TGCCTTGATT TATGATGGTA TCAAACATTA CTTTGTGGTC 5340 
AATTTCCCAC AACGTCCAGG AGCTTTGCGT GAGTTTGTAA ATGATATCCT GGGGCCAAAT 5400 
GATGATATCA CACGTTTTGA GTATATCAAA CGAGCTAGCA AGGGAACAGG CCCAGTATTA 5460 
ATTGGGATCG CTTTAGCAGA TAAGCATGAT TATGCAGGTT TGATTCGTAG AATGGAAGGT 5520 
TTTGATCCAG CTTATATTAA CTTAAATGGT AATGAAACGC TTTATAATAT GCTTGTCTGA 5580 
GGACTAATAA AAAAATATCA TACCTTCATT TTGATTTCCT ATCTATTGAC AAGCATAGTC 5640 
ACACTGTCTT TAATACTCTT CGAAAATCTC TTCAAACCAC GTTAGCTCTA TCTGCAACCT 5700 
CAAAACAGTG TTTTGAGCAA CTTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTCATTG 5760 
AGTATAAGGT ATGATTTGAT TTCTTTTTGT TGACAAATAT ACTATATTAA AAAGATATAT 5820 
AAGTAATTAA CTGAGCTTAT CTGTCTTGTC ATCTCTATTA AGGATGGTTT AGATAATCGG 5880 
GTGTCTGCTT CTAGGCTAGC ACCTCAATAT CCAAAGGAGT GATGAATTTG AAGGACATAA 5940 
GGAATACCTA TCTCTCAGAT GATTTATTGA GGAAGAAAGA TAGGAGTTTT TGAGCTAGTG 6000 
AAGGCTTGGA TTTCTAAAGG TTAGAACTAT CATCTTCAGT TCTTAAATCG AAGAAATAAG 6060 
CTATCTTACG GAAATAGAGA AGCATTTTTT AAGAACTTGA ATAATTTCGC ACCTTAAGAG 6120 
GGTAATAATA CAGTATTTTT ATTAGCAAAT ATTTATGGTG TAGAGGCTAG CAAAACCTAT 6180 
ATATTATCGG ATTTAAAAAG GAAGTAAGAA A 6211 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7939 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



CCGGACTCCC CACGATTCTT CAAAATAACT GAGTATATTT CTATCTTGAT TTTCAGATAT 60 
AAATTCTTCC TTCTGTGGCC TCTTCTTACG CTTGAGAAGA GCTTCTCCGA CATGGCTTCT 120 
TCCTTACTGA GCAAAACCTT GAGCATAGAT AAGTTTGACT GGCAAGCGTG CTCTTGTATA 180 
TTTGGCTCCC TTCCCACTAT TGTGGATAGC GAGGCGTCTT CTCATATCAG TCGTATAGCC 240 
TATATAGTAG GATCCATCAC GACACTCCAG AACGTACATA TAAGCCTTAT GATCCATAAT 300 
AAATCTCTTC GATTTCGGGC GTATAAGAGC CATCATCATT GTGGACAATC AAAGGAGGTA 360 
AGACCTTAAA GCCACTTGTT GAGCCATCCT TGATCGCCTC AATCAAAAGC ATATTGGCTT 420 
CCTTTTCTCT TTTTGGATAA ACAAACTGCA GGCGCTTAGG GGCTAGATTA TGTCGTTTTA 480 
ACGTATCCAA AATATCCAGA AGTCGATCAG GACGATGAAC CATGGCCAAA CGCCCATTAG 540 
ACTTGAGAAT ACTCTGGGCA CTACGACAGA TTTCTTCCAA ATTAGTCGTG ATTTCGTGTC 600 
GAGCCAAGAG ATAATGTTCA CTCTCGTTCA GATTAGAATA AGGATTCACC TTGAAATAGG 660 
GTGGATTACA CAAAATCATA TCCACCTTAC TCCCCTGAAT GTGAGCAGGC ATATTTTTCA 720 
AATCATCGCA GATGACCTGC ATTTGCTCCT CTAATCCATT CAAACGGACA GAGCGTTCAG 780 
CCATATCCGC CAAACGCTCC TGAATCTCAA CAGACAATAT CTGTGCTTGA GTACGAGTGC 840 
TAGCAAAAAG CCCCACTGCT CCATTCCCAG CACAGAAATC CACAATCAAC CCCTTCTTAG 900 
GAAAACGTGG AAATCGTGAT AAGAGAACAC TATCCACCGA ATAGCTAAAA ACCTCTCTAT 960 
TTTGAATGAT TTTGATATCT GTCGAAAAGA GCTGGTTAAT GCGCTCTCCT GATTTTAATA 1020 
ATTGTTCTTC TTCCATGGTC CTATTATAGC AAATTCATAT TAACATTACA AAAAATATAA 1080 
AACTCTAAAC TACTTCTTCT TTTTTAAATG GTGCAGGGCT TCTCCAGTCC AGATTGGTAG 1140 
CATTCGTCGA AAGGGAGCAA AGCCGTAGTT AAAGCGGTCG CTTGAAAAGC GTCTCCGTCT 1200 
AGGAAACTGG TACTTTTCTT CCTCCAAAGT GCGGATAGAA AGACTGGCTT TCCCTGTAAA 1260 
TTCATCTAAA TCCACTACCT GAACTTGAAC CTCTTCATCG ACTTTCAAGG TTTCATGAAT 1320 
ATTTTCAATA AATCCTGTCC GAATCTCTGA AATGTGAATC AGCCCCGTAT CACCCGTCTC 1380 
TAACTCAACA AAGGCACCGT AGGGCTGAAT CCCTGTAATA CGCCCCTTTA GCTTATCACC 1440 
GATTTTCATC TTAGTCCTCG ATTTCAATAG TTTCAATTAC AACATCTTCA ACTGGCTTGT 1500 
CCATAGCTCC TGTCTCAACA GCAGCAATGG CATCCAAGAC AGCGTAAGAT GCTTCATCAG 1560 
CTAACTGACC AAAAACCGTG TGACGGCGGT CTAGGTGAGG TGTCCCACCT TGATTGGCAT 1620 
AGATTTCTGC AATCGGTTCT GGCCAACCAC CACGAGTAAT TTCTTTCTTA GAATAAGGTA 1680 
GGTGTTGGTT TTGCACGATA AAGAACTGGC TGCCGTTGGT ATTTGGACCA GCATTTGCCA 1740 
TGGAAAGAGC ACCACGGATA TTGTAAAGCT CTTCTGAGAA TTCATCCTCA AAAGATTCGC 1800 
CGTAGATTGA CTCGCCACCC ATACCAGTTC CAGTTGGGTC TCCACCTTGG ATCATAAAGT 1860 
CCTTGATAAT ACGGTGGAAA ATGACACCAT CATAGTAGCC ATCTTTTGAA AGAGATACAA 1920 
AGTTAGCCAC TGTTTTAGGA GCATGTTCAG GGAAAAGCTT GATACGTAAG TCTCCGTGAT 1980 
TGGTCTTAAT AGTCGCAAGA GGACCTTCTA CTGTTTCAAT GTCTACTTGT GGAAAATGCA 2040 
ATTCTTTTTC TACCATACCA AATACTTCTA AGGCAGCAAA AATGCCATCT TCTTCTAATG 2100 
TTTTTGTAAT ATAATCTGCT TTTTCTTTGA TTTTATCATG AGAAATTCCC ATGGCAACGC 2160 
TGATTCCAGC ATAATCAAAG AGTTCCAAGT CGTTGAGACC ATCTCCAAAA ACCATGACCT 2220 

TCTCTGGTTT CAAGCCAAGG TGTTCCACAA CCTTTTCCAC CCCCGTCGCT TTGGAGCCTG 2280 

AAATCGGCAC AATATCAGAC GAATGTTGAT GCCAACGAAC CATGCGAACT TTGTCTGAGA 2340 

GACTGTCAGG CAAGTGCAAG TCATCTCCCT TATCTTCAAA AGTCCACATC TGATAGATAT 2400 

CTTCTTTTTC ATGGAAATCG GGATCTACAT CTAAGTCGGG ATAAATTGGA TTGATAGCTT 2460 

CACTCATCAT ATCGGTGCGA GTCGACAACT TGGCATCATG ACTCCCAACC AAGCCATACT 2520 

CAATTCCTTC TTGCTTAGCC CAAGAGATAT ACTCCTCAAC ATCTGACTTT TCAATCTGAT 2580 

GCTGATAAAT GACCTGACCT TTTTTATCTT CGATATAAGC CCCATTCAAA GTTACAAAAA 2640 

AGTCAGGCTT GAGATCACGA ATCTCTGGAA CAACACCAAA AATGCCACGT CCAGAGGCGA 2700 

TTCCTGTTAA AATTCCTTTT TCACGCAACT GTTTAAAAAC AGTGGGAATT GTAGTTGGAA 2760 

TAAACCCTGT CTTTGAATTC CGCAATGTAT CATCAATATC AAAAAAGACA ATCTTGATCT 2820 

TCTTTGCCTT GTATCTTAAT TTCGCGTCCA TCTCACTACC TCTTTCAATC TAACTCTTTC 2880 

CATTATATCA TAAAGTAGGC AAATCCCCTA TTTTCAAAAA GTTTATCATT TTTATTTTAA 2940 

TTTCTTGGAT GAGAAAAGAG ACATATTTAT GAAAAAGCTC CATCGTGCTT TTAATGTCTT 3000 

CTCTTGTTTT CAAACTCGTA AAAAGGGAGC CACTGATCCT AACTCGCTCT CTCATTTCAA 3060 

AGCTTGTGAA AAAAGACCCG TTGGGGTCTT AATTCGCTTT CTTGTTTTCA AGCTCATGAA 3120 

AAAGAGACCC AACTGGGTCT TTTCTTTAAT CTTCGTTTAC GAAAGGCATC AAAGCCATTA 3180 

CGCGAGCGCG TTTGATAGCT GTTGTTACTT TACGTTGGTT TTTAGCTGAA GTTCCTGTTA 3240 

CACGACGAGG AAGGATTTTC CCACGTTCTG AAACGAAACG GCTAAGAAGC TCAGTATCTT 3300 

TGTAATCAAC ATATTCAATT TTGTTTGCTG CGATGTAATC AACTTTTTTA CGGCGTTTGA 3360 

ATCCGCCACG ACGTTGTTGA GCCATGTTTT TTCTCCTTTA TAAGTTTAGT TGTCCATTAG 3420 

AATGGTAAAT CATCATCTGA AATATCCAAT GGGTTTGTTG CTCCAAATGG ATTTTCATTA 3480 

CGTGAAAAGT CTGGTACTGA ATTTGTAGGT GCTGAATAGT TTGCAGTTGG TGCAGAGTAA 3540 

GCTCCACCTG TGTGACCCTC ACGCACACTA CGGCTTTCCA ACATTTGGAA ATTCTCAGCC 3600 

ACGACCTCTG TCACGTAGAC ACGTTGTCCT TGCTGGTTAT CGTAACTACG AGTCTGGATA 3660 

CGACCTGTCA CCCCGATAAG TGAGCCTTTT TTAGCCCAGT TAGCAAGATT TTCAGCCTGT 3720 

TGGCGCCACA TAACGACATT GATAAAATCA GCCTCACGTT CACCATTTTG ACTCTTAAAT 3780 

GTACGGTTTA CTGCAAGAGT AAAAGTCGCA ACTGCTACAT TTGATGGGGT ATAACGCAAC 3840 

TCAGCGTCAC GTGTCATACG CCCTACAAGT ACAACATTGT TAATCATAGT TTACCTTCTT 3900 
ACGCGTCAAT TTTGACGATC ATGTGACGAA GAATGTCAGC GTTGATTTTT GAAAGACGGT 3960 
CAAACTCTTT AAGAGCTGCA TCGTCATTTG CTTCAACGTT AACGATGTGG TAAAGTCCTT 4020 
CACGGAAATC TTGGATTTCG TATGCAAGAC GACGTTTTTC CCAAGTTTTT GATTCAACAA 4080 
CAGTTGCACC GTTGTCAGTC AAAATAGAGT CAAAACGTGC TACCAAAGCG TTTTTAGCTT 4140 
CTTCTTCAAT GTTTGGACGA ATGATATAAA GAATTTCGTA TTTAGCCATT GATATGTTCC 4200 
TCCTTTTGGT CTAATGACCC CAAGACTTTG CAAGGGGTAA GTGAGGTTCG CTCACAATAA 4260 
ACTATTATAC TAGAAAAAAT TTTTTTACGC AAGTAAAAAC ACTAGAATTC GAAAAAACGC 4320 
CACATGGGCG TTTTCCTGTT CTTATGGTTT GATACGGTGC AACATACGTG GGAATGGAAT 4380 
AGCTTCACGG ATATGTTTTG TTCCTGCTGC GAAGGTTACC ATACGTTCGA TACCGATACC 4440 
AAATCCTCCG TGTGGAACTG TACCGTATTT ACGAAGGTCA AGGTAGAATT CATATTCTGT 4500 
ACGATCCATG CCAAGTTCAT CCATCTTAGC GACAAGGGCA TCGTAATCTT CCTCACGCAT 4560 
AGACCCACCG ATAATTTCTC CATAGCCTTC TGGAGCAAGC AAGTCTGCAC AAAGCACGCG 4620 
CTCTGGATTT CCAGGAACTG GTTTCATGTA GAAGGCCTTG ATGGCTGCTG GATAGTTCAT 4680 
GACAAATGTT GGCACACCAA AGTGGTTTGA AATCCAAGTT TCGTGTGGTG ACCCAAAGTC 4740 
ATCACCATGC TCAAGATGCT CGTAGTCAGC ATCTTCATCA TTTTCATGCT CTTGCAAGAG 4800 
GTCAATGGCT TGATCGTAAG TGATACGTTT GAATGGCTCT GCAATGTAGC GTTTCAAGAG 4860 
TTCTGTATCA CGTTCCAAGG TTTCCAAGGC TTGAGGCGCG CGGTCAAGAA CACCTTGTAG 4920 
AAGAGCTTTC ACATAAGCTT CTTGCAAGTC AAGCGACTCA TCATGTGTCA AGTATGAGTA 4980 
CTCAGCATCC ATCATCCAGA ACTCAGTCAA GTGACGGCGT GTTTTTGATT TTTCAGCACG 5040 
GAAAACTGGA CCAAAGTCAA AGACACGACC AAGAGCCATA GCCCCTGCTT CTAGGTAAAG 5100 
CTGACCTGAT TGGCTCAAGT AGGCTGGCGT TCCGAAGTAG TCAGTTTCAA AGAGTTCTGT 5160 
AGAATCTTCT GCCGCATTTC CTGAAAGAAT TGGGCTGTCA AACTTCATAA AACCGTTCTT 5220 
GTCAAAGAAC TCATAAGTTG CATAGATAAT AGCGTTACGG ATTTGCAACA CAGCTACTTG 5280 
CTTACGAGAG CGTAGCCACA AGTGACGGTT ATCCATCAAA AAGTCTGTTC CGTGTTCTTT 5340 
TGGTGTGATT GGGTAGTCTT GAGATTCACC GATCACTTCG ATGTCTGTGA TGTCCAACTC 5400 
ATAGCCAAAT TTAGAACGTT CGTCCTCTTT GACAATACCT GTCACATAAA CAGACGTTTC 5460 
TTGGCTCAAG CGTTTGATAA CATCAAACTT CTCAAGTCCC ACTTCTTCAC CAAATTTTTC 5520 
GACAAAGTTT GGTTTAAAAG CCACACCTTG AAAGAAGGCT GTTCCATCAC GCAATTGTAA 5580 
GAAAGCGATT TTTCCTTTTC CTGATTTGTT GGCAACCCAA GCGCCAATCG TCACTTCCTG 5640 
ACCAACATAG TCTTTTACGT CAATAATCGT TACACGTTTT GTCATTATTT TTCCTTTTCT 5700 
TTTTTATTCT TTATGGCAAA CCACCTCTAT ATTGTTCCCA TCCAGGTCAA TCATAAAAGC 5760 

AGCATAGTAA ATCGGATGCT CACTTCGATA ACCAGGAGCC CCATTGTCTC GCCCACCTGC 5820 

CTCTAAGCCA GCCTCATAAC AAGCCTGAAC TTCTTCCTTA TTTTCTGCTA AAAAAGCAAA 5880 

ATGAACAGGA TCTTGTGTTC CCTGAGTCAG CCAAAAATCA CCACCAGGAT GAGGGCTGTT 5940 

CGGGGATAGA AAACTAATTA GAGAACTAGT CTTAAAAGCC AATTTATAGT CCAAAGGAGC 6000 

GAGAAAACTC CTATAAAATC CTTATGAAAT TTGTAAATCC TTTACCTTAA TCTCAAAATG 6060 

ATCAATCATT CTCACTACCC ATAAATGCTT TCAAGCGTTC GACTGCTTCT TTAAGCGTGT 6120 

CTAGGTCTGT CGCATAGCTG AGGCGGACAT TTTCTGGTGC TCCAAATCCA GCTCCTGTTA 6180 

CCAAGGCCAC TTCGGCTTCT TCTAAGATAA CAGTTGTAAA GTCTGTCACA TCCGTGTAGC 6240 

CTTTCATCTC CATGGCCTTT TTGACATTTG GGAAGAGATA GAAGGCCCCT TGCGGTTTGA 6300 

CCACTTCAAA TCCTGGTACC TCTGCAAGGA GGGGATAGAT GGTATTAAGA CGTTCCTCAA 6360 

AGGCCTGACG CATGCTTTCT ACAGTATCTT GCTCACCTGA TAGAGCCTCA ACTGCTGCAT 6420 

ATTGGGCTAC TGCTGACGGA TTCGAAGTTG TTTGACCTGC AATCTTGGAC ATGGCAGCGA 6480 

TAATGTCTGC TTCTCCAACG GCATAACCAA TCCGCCAACC AGTCATGGCA TAAGTTTTAG 6540 

ACACACCATT GATGACCACT GTTTGCTTGC GAATCGCTTC CGATAGGCTA GAAATCGGTG 6600 

TGAACTCATG ACCATTATAA ACCAAGCGGC CATAGATATC GTCTGCTAGG ATGAGAATAT 6660 

CATTTTCTAC AGCCCAGTTT CCAATTGCCA AGAGTTCCTC ACGGGTGTAA ATCATACCTG 6720 

TGGGATTAGA TGGCGAATTC AGCACCAAAA CCTTGGTCTT GTCAGTGCGA GCTGCTTCTA 6780 

ACTGCTCTAC GGTCACCTTA AAGTGATTGT CTTCCTTAGC AGAAACAAAG ACGGGAACGC 6840 

CTTCTGCCAT CTTGACCTGA TCTCCATAGC TAACCCAGTA TGGGGTTGGG ATGATGACTT 6900 

CATCACCTGG ATTGACCACA GCCATAAAGA AGGTATAGAG AGAATATTTG GCTCCCGCAG 6960 

CGACTGTCAC TTGATTTGAC GCTACAGAAT AGCCGTAAAA GCGCTCAAAG TAGCTATTGA 7020 

CCGCCGCCTT AAGCTCTGGC AGACCTGAGG TTACTGTATA AAAAGAAGCA CGCCCATCTC 7080 

GAATCGATGC AATGGCGGCA TCTTGGATAT TTTTGGGAGT AGTGAAATCT GGCTCACCCA 7140 

AGGTTAGAGA CAAAATATCT CTACCCTCAG CCTTCAGTGC TTTGGCACGG GCTCCAGCAG 7200 

CCAAAGTCAC ACTTTCTTCC ATTTCTAAAA CACGGTTGGA TAGTTTCATA GGCCCTCCTT 7260 

GTTGACCAAT GCTCCTGTTT CAAAATCTAC TAGATAAAAA TCAGATCCTG ACTTAACTTC 7320 

CCAGATTGGC TTATCTTGAT AACGGCCAAA GGTTATCTTG TCAATCTCGC CAGCTCCCTT 7380 

TTCCTTAGAA ACCGTTTCTG CTTTTTCTTG TGAAACACCC TGATTTAGCT GATAAACGTA 7440 
AATCTTATGG TCATCTTTAC CAATCAGGAC AGCAAGCGCT TCTTGCTGTT TGTTACGACC 7500 

AAGAACGCTG TAATAAGATT CCAAGCCATT GTATAAATCA ACCTGATCAG CCTGCTCTAA 7560 

TCCTGCATAC TGCTGAGCTA ATTTTTCTCC TTCACTTTTA GCTGTTTGAT AGGGTTTCAT 7620 

GCTAAGAGAA ACCATATACA GAAAGGAACC ACTGATAACC ACAAACAAAA TCGTCATCCC 7680 

TAGACCATAC TGCCACAGTA GATTATTTTT TGCTTTGTTT TGTCTTTTTT TCACTCGTCT 7740 

ATTTTACCAT CTATTAAGCT TTATTACAAG TGAATATAAG AATACTCTTC GAAAATCTCT 7800 

TCAAACCACG TCAGCTTTAT CTGCAGACCT CAAAGCTGTG CTTTGAGCAA CCAATTCTAT 7860 

TTCTCCCTTC AAACAAAACC GATTTTGAAA GTGAAACAGT TCTTACTTTT TCAGTCACAA 7920 

ATGATTAGAG TTTGCCGGG 7939 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CCGCTCTACC GTCAAATAAT TACCATTTTG TTTAATACCG AAATTTTTAT CTACTGAAAA 60 

TTCAGTTGGT CTGTTGGTAC GATCGTCGTA TACAGTACCA TTCTCACGAA TAGTATAATT 120 

GTAATCAGTA TCACCTTGTT TCCTTAATTT AAGGTAATAA TTACCATCAA TTTGTTTATA 180 

ACCTGAATCT TTTCTAGTTG CTTCTCTAAA ACTTACTCCA GCAGGCATCA CATCAGCAAA 240 

CATGAGTACT TGTTTGTTCT TTTTTTCAAC AATAACAGAG TCAATATAGG TTGCACCACC 300 

GCTGATTTGT AAGTCACGTC CACCAACTTC ACGAGGCCAT TCTAATGGTA CTGGCGCAAA 360 

ATCATCGAAT GCCAATGTTA ATTTTGGTTT AGTCCATGTC TTACCATTAT CATCACTATA 420 

ACTTGTAGCA ATATTAATTT TATTCAAGAA ATCATGAGTT CCACCGTAAC GAGCGTCAAT 480 

GCTTGAAAAT ACCCGACCAT TGCTAAAAGT ATACAGAACT GGAATACGGA AATAGTTAGA 540 

ACCTGTTGTA TCATTAGCCG TATAAATTAA ATGTCCAGTA ACAGCGTTTG TTGTCATCTT 600 

TTTAACAGTT TCTTCATCCA ATGCACTATT AAAGAATTTG ATATTTTCTA GTGTTCCGTT 660 

AAAACCAAAC GCCGTTTTTC CTGCACGTTT CACTCCCCCA AGCATATAGT AATCAATACC 720 

TTTAATATCC TTGATGTTTA GGAAATTATC CACTTTCTTT TCTACTACTT TTGTACCATT 780 

TGCGTATAAA GAATATGTTT TTTTGACTGA ATCTGCTACT ACTGCAACAG TGTTAGTCAC 840 

AGCCTCTTGT TTGTACTTAC CCCAAACTGA AGCAGGTCTG GATACTAGGT TATTTTTATT 900 
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GGAAGAAGTA TCACGCGCTT CCATCCCCAA CTCACCATTG TCTCTAAGGA ACACATCTAC 
ATAACTATTT TGTTGACCGG GTTTGGAATT AGATATTCCA AACAGAGCTT GTAAGCCTTT 
CTCACTTGAC TGATTGTACT TAATCACTAC AGTAAAGTCA CCGCTAGTAA ATTTATCCTT 
TAACTCTTTA GTAACATTTT CTCCGCCCCC TGTTAAAGTA ACATTATTTT TTTCTAAGAC 
AGGAGTTTCT TCCGCTGTAG AAGATGGATC CTTAACAGTA GTTTCAACTG TTCGAGGTTG 
TACAGTAACT TCCGAAGAGT TATCCGATGT AGGTTGTACT TCCGAAATCG GAGTCGTTGG 
TGCAACAGGT TGCACCAACT TTGGTGTTGA TACTTCAGAA GTTTCAGTCT CCTGAGCTGC 
AACTGAGTTA GCAACAAATG CTGATAATAC CACTACAGTA CCTAAGGTTA CATATTGTTT 
AATATTTTTT TTCATTTTAT TTTTCCTCGT TTAAAACTTT GATAACAAGT TTTTTAACAG 
TTTCATCATT GCAATGAATC TTTGGTTGGT GAAGATCTTC TTCAAAAGTC ACCAACATAT 
TCCCTGGAAG CAATTCAACA ATTTGATAGT CTTTGCTATC GTAAAAAGCA ATATCCTTCT 
CTTCGCTAAA AGGTACACGT GACTGGGCAC GAACTGGGGA AGTTACTGCC ATTTTTTCAG 
TATTTTCAAC AACAATATGA ATATCTAAAT ATTTCTTATG AGTTTCAAAA ATATCTCCTG 
GAACTCCATC AGCTAGATAA GTCATACAAT TTGCAAAAAC ATTTTCCCCG TCAATATCAA 
TTTTTCCATC AACTAAATCT GTCAAATTTG TATTTTCTAA AAAATCACAG ACTTTTGAAA 
AATATTTATT GACAGAAGCA TATCGTTTAA AATCAGATTG TTCAGAAATA ATCATATTAT 
TTTCTCTTTT CTATTAGTGA CGAACTTCCC AACTTGAATC CGCTTTAATT TCTGTAATAT 
CATGAATCGT TGTATATTTA GGTGCAGATA CTTTATTTCC AGTAAGAACA GATACAATAT 
AACCTGAAAC TACTGATACA GAGATTGAAA TCAATGAATA TGCCCAGTAG CTAACAGCTG 
TTGGAGGAAG GAAGTATTTA ATAAATACCA TGACGATGGT TGATACAATC AGCGCTGCAT 
AAGCACCTTG TTTATTTGCT TTTTTAGAAA CAAATCCAAG AATAAATACA CCACCAAGTA 
GACCAAGTAC AAGTCCCATG AAACTATTGA ACCATTCGTA TGCAGATTTA ATATCTGAGT 
GAGCCATGAC AATGGAAACA CCAATTGAGA ATAAACCTAC TGCTAGAGAT ACGAATTGTG 
CAATTTTCGT ACGACGATTG TCTGACATAT TTTTAGAAAT GACATCTTGA ATATCCAATG 
TCCATGAAGT TGCAACAGAG TTCAAACCTG TTGAAATAGT TGATTGAGAT GCTGCATAAA 
TCGCTGCCAA GATCAAACCT GTGATACCTA CTGGTAACTG GTATGCAATA AAGTACATAA 
AGATTTGGTC TTGAGGGATA TTGCTAGCTG CACTATCTGC ATTTTGTACT TGATAGAATA 
CGTACAAGCC TGTACCAATC AAGTAAAAGA CTGTTGCAGT TGCAAGTGAC AAAACACCGT 
TTGTGAACAA CATCTTATTA AGTTTCTTAA TATTTTGTGT TGTAGTAAAA CGTTGAACCA 



960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640
AATCTTGAGA TGAAGCATAG GAAGACAAGA ^CCC TGAACCCATC ACAATTAAAA
AGATGGAGTT TGAAAGCAAG TTAGGATCGA AAAGTTTTTC ATTTGCAGCA AGGAATTTCC 
CC^CTAA TGTTTCTGCT AC^CACCAA AGCCACC^ AATATTAGCA ATCAGTACAA 
ATAAAGCTAA AACGACACCA CTAATCAGAA TCACACCTTG AATAAAGTCT GTCCATAATA 
cgga^ag ACCACCAGTA T AAGAA T AAA CAA^GCAAC TACACCCA TC AAAATAATCA 
AAATATTGAT GTCAATTCCT GTCAATACTG ATAAACCAGC TGATGGGAGG PACATAATGA 
TAGACATACG TCCCAATTGA TAAATAATAA ACAAGAGTGC TGAAATAATA CGAAGTGCTT 
TAGAATTAAA ACGTTTATCC AAGTAATCAT ATGCCGTATC GA^ATC CGTGCAAAGA 
TAGGTAAGAT AAAAOGAATT GTCAGTGGAA TAGCTACTAC CATCCCTAAT TGAGCAAACC 
ATAAAATCCA GCTACCTGCA TAAGAGCTAC CAGCGAGTCC CAAGAAGGAA ATCGGACTGA 
CCA^cc AAAAATGGAT ACCGAAGTAA CATACCAAGG AACCGAACCA TCTCCTTTAA 
AGAACTCTTT TCCT^CATC TCTTT^AG AGAAATAGAT ACC^CAACC AACACCGCAA 
GTAAATAAAC AATCAAGATA ATTAAGTCAA TTATTGTAAA TCCTGTTGTG CCCATAACAT 
ATCTCCATAT TATTATAAAA ATTCTTTTCG ATAAGTTCTG 

^CAACTTCC AAGTCACCTT C^CCAA^ TTCTAAAGGT TGACGAACAG 
AACCTAAATC AAGTTTTTCA TTTAGACGCA AAACTTCTTT TGCTACAGCA TACATATTTG 
CCTTACCTGA TATCATCTTA TAGATAACTT CATTGATAGC ATATTGAAGT TTTTTAGCTG 
TATCTAAATC TCGW^ ATCAAACTTT CCAATTTCAA GAACAAATCT GGCATAACGC 
CATAAGTAOC ACCAATACCA CCATCAAGCG ACCACCAAGA TATTGTTCAT 

C^GACCA^T GAATACAATG TAATCTTCTC CACCTGCAGC TACAAACATT TGAATATCTT 
GTACAGGCAT AGAAGAATTT TTAACTCCAA TCACACGAGG AT^CGC ATTGTTGCAT 
ACAAACTACC AGTCAACGCA ACCCCTCCCA ATTGTGGAAT ATTATAGATA ATAAAATCTG
TA^GACGC AGC^CC ATTGCATO.C AATATGCTGC GA^GAA^c TCPGGCAATT 
TGAAATAAAT AGGTGGGATA GCTGCAATAG CATCGACTCC AACACTTTCT GAATGTTTTG 
COURCn* ACTATCTTTC GTGTTATTAC ATGCAATATG GTTGATAACT GTTAATTTAC 
CTTTAGCAAC TTCCATAACA GCTTCAATAA TTTGTTTACG ATCTTCTACA CTTTGGTAAA 
TACA^CACC TGAAGAACCA TTTACATAGA TACCTTTTAC ACC^GTCA ATGAAATATT 
GTACCAGAGA TTTTACACGA ^n^AA TTTCACCATT TTCATCATAG CAAGCATAAA 
ATGCAGGGAT AACGCCTTTG TATTTAGTTA AATCTTTCAT CAGATTTCTC CTTTATATTG 
TTTTTTATTT GATGACATTA ATAAATCGCT GAGCAATTTC TTTTGGACGT GTAATCGCTC
CACCAATGAC TACACTGGTA ACACCTAAAC TATAAGCTTT TTTTAATTGT TCTGGATAAT
GAATTTTTCT TCGGCAATTA CCGGAATATT AAAATCAGCC AATTTTTTCA TTAGTTCAAA
ATCAGGCTCA TCTGATTGTA CACTTGTACT TGTGTAACCT GATAATGTTG TACCAACAAA 
ATCAACGCCT GATTTAAATG CATAGAGACC TTCATCTAAA TTACTTACAT CCGCCATCAG 
CAATTGATTC GGATATTTTT CTTTTATTTT TTTGATAAAT TCACTGACAA CTAAGCCATC 
ATATCTTGGT CTTAAAGTTG CATCAAATGC AATGACTGTT GTTCCGCATT CTACAAGTTC 
ATCTACTTCT TTCATCGTAG CAGTAATATA TGGTTCTTGA GGTGGATAAT CCCTTTTGAT 
AATTCCAATT ATTGGTAAAT CTACTACTTT CTGAATTGCT TTAATATCAC GCACAGAATT 
TGCGCGAATG CCCACTGCTC CTGCCTCTAA AGCTGCTTTA GCCATAAAAG GCATCAAGCT 
AAATTCTTCA TTATAAAGGG CTTCACCAGG TAAAGCTTGA CAAGAAACAA TGACTCCACC 
TTGAACTTGG CTTATAAATT TTTCTTTAGT CCAAATTTGG CTCATTTTAT TATTCCTCCT 
TATGGATAAT AGTTTGATTG TAATAATATT GTCTCTCTGG ACTTTCCAGA TAATTAGAGA 
ATAAGCAGTC TGTAATTAAA AGTATTGGAA ACTGAGGTGA TATGCGATTG CCATACGAGA 
GATGATCGGT CGAAGCTAAT AACAATAGTT CATCAAAGAA ACAATCTTCT TCGTCAAATT 
TTCTTGTAGT CATTAAAACT GTTTTAGCGC CTTTATCTGC AGCTTTTTGT AGACCTTCTA 
GTACAATATC AGTTTGACCT GAAATGGATG CTCCAATGAC AAGGCAATTT TCATTAAGTA 
GTAAGCTACT CCACAAAATC ATATCCTCGT CTGATAATAC TTCACCAATC ACTCCGAGAC 
GCATAAATCT CATCTTCATT TCTTGTAAAG CAAGAACAGA ACTTCCTTTA CCGTAGAGAT 
ATACACGCTC AGCAGTTTCT ATCATCTCAG CAATACGCTC AAGTTGAACT TCATCAAGAA 
CCGTGTAAGT TTTTCTCAAC ATTTCCTCAT AGTCGGATAA AACTTTTTCT GTTGCCTCTG 
TATATAATGC CAACTTTTCT TTCTCATGAA TCATCTCTTG GTATTTGAAA ATGAATTGTC 
TAAAACCTTT AAAACCACAT TTTTTCGCAA ATCGAGTCAA TGTTGCTTTG GATACATTAA 
GGTATTCGCA CAATGCTTTA GATGAATAAT CATTCAGAGG TTGCTGTTTT AAGAAGAATT 
TAGCAATGTC TTTTTCAGCA TATGCCATAT TTGGTAAGTT AGCTTCTATC ATTGGAATTA 
GTTCTTTTTG CAGTAACATA TGAGCTCCTT AGTTGAAGTA AACGTTTACA TTCTTTATTT 
TAACACTTTT TTTTTTTTTC AATATTTTTC ATAAATTAGA AACTAGTTTC CAATTTCTTT 
CGTTTCATAA CAGAACAACA AACATAAAAA TATAATAGTT TTTATTCTTT TTATCGTAAT 
TATATGTATT GTAAGAACGT TTATCACTAA TAATATGTTC ATATTAAAAT ATTTTAGTAA 
TATTTTATTT TGGTTTTATT ATTTCTTTTC GGAATTTCTA TATAATATTT TATTTCTAAA 



4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 






AAAATTGAAA AAATATTTCT AGTTTCTTTA TTTTATATAG GTAATATATT TTATTTCTAA 6240
ATTAAAAGAG AATCCCATAA AAACTACAGA TTTATGAGAT AAATCAGGTC ACCTATTTTA 6300
AAAAAGCAGC AAACTATAAA CTAAAAAGTT CCACACCAAA TGTAACCCCA TACTTCCCCA 6360
TAAGTCAGAT TTATAGCGCA CCATACCTAA AAACATTCCA AGTGAAACGT ACAGACACCA 6420
AGCTAGAATG GTTCCTGGAT GATGTACTAA GGCAAATAAA ACACTTGTCA AAGCAACTCG 6480
AATATCTAAT TTTCTAACCA AGTTCCATAA AATTTCACGA TACAGAAATT CTTCAACCAT 6540
ACTCGCATTG ATTAAGAACA ATAAAAATGA AAACCAAGGA ACTTGATGTT GAAGGCCAAT 6600
TAAATTTGTT TGATTCGTGC TTCCTTGAGC ATGAATCAGG CTAAAACATA GACTTATAAT 6660
CAGTAGACTA GCTAGTCCAA TACCAAGGCA TTTCATCCTA GTTTTCATAT TGACCTTGAC 6720
CACTTGTTTT CGTTGACCAT ACATCCATAA AAAAGAAAAA AGAGACGCAC CATAGAGAAC 6780
CTGTAGTATA GTTAACTCAC CGATACAAAG AAATTTCAAT AAGTATAGAG ATACCAATAd 6840
GACATTTACT TGTTGGAATA TATAAACTGG AATTATTCTT TTCATAGTTA CCTCCGAAAT 6900
AAATCTTCAT AATCTAAATC TAATATCTGC ACAATCCTTT CTACCCATGG ACTTTGARfif* 6960
ATTCGTTGTT CCATCTTGTA GTGGCGAATC TTTTGATATA AACGATTCAA [illegible] 7020
TAGTGAAACT CTCCCGCAAA CATTTTTCTG GTTAACTCAA TCCAGCTGAT [illegible] 7080
GCCAAAATAA TGGACAAGTT CTCCCAAAAT CGTTCAGCCA TATTrCTTCT [illegible] 7140
GATAAATAAT GTGTTTGyGC CATGTAAATC AATTGTTTCG TATCTCTTGG CAATAGAGC? 7200
CTAGCCTCTT CCAAATTCAG ACTTGGATAA ACCCGCTTAT TTGAAACCAC AAAAGGAAGT 7260
CCGATGGTTA GTTCAGGATT TTTTAAAATT ATCTCAACGA AATCCGTTAA TCTTAGATTG 7320
TCACGGTTCT TAAATCGTAA TAAATTGGGA GATAAAAACT CAAAACAATC TGAAGAATAG 7380
CTCATCATCT CAATTAATTT GTCCTTTGTC ATTTCAGAAA CTGAATGACA AGATACCTCA 7440
ATGCCATAGT TTTGGAAGAA GTCTAAAAGA AGTTGATTTC [illegible] [illegible] 7500
TAGAGATCAA TCATGGGAGA CCTCCAACAA ATTTGCTTCC ATTTGATATT CTGAGACGAT 7560
TAAGGAATCT AACAACTTTG AGAAGTTAAT CGATTTCTTG TCTTCATCAT AAGCTTTTAC 7620
AGTTACTTGG GTTGTAAGTA TCCCCTCTTT TCCCTCGGCT CGATAGTCTT GTCAATATAA 7680
AACAAAAACA AGATTCTGAT TATCATCTAC AAAGGCATTA ACTCCGTTCT TTATATCCTG 7740
ACTTTCAAGG AATTCCATAA CGTTTTGAAG ATAGGATTCA TAAAATAGTG GGTAATTATG 7800
TTTTTTATGG TAATCATCTA AAAATGTTAC CTCAAACTCA CATGGATAAT TGGGCATCAA 7860
AAATATTTGT TCATCCAGCT GTTTGATTTC TGCATCATGT AATTCTGTTT CTAATTCATC 7920
ACAATCTAGT ATTGATTCTT TATTTAATGC TTTTATCTTT TTCCTCTATT TCTTTTAATT 7980






TCTTTGCGAT TGCGGCAATC ACAGGAACGG TTACACTATT ACCAACTTCT TTATAGAGCT
GACTATTAAT AGAGACTTTT CTAGCAGCTT CAAAAGCCTA ATCAGGAAAG CCATGCAATC
GAAAACACTC TTTAGGAGTG ATTCGTCGTA TTCTCAAACG GTAAAATTGT CCATCTATTA
AAACACCAGC TACTTGGTAA ACTTGTTTAT CTTCTCCTTC ATAGCTAGCC ACTACTACTC
CCATTTGACC ACTAGTTGTT AACGTATTAG CTATACCTTT TCCAACTCTA CCACGACGAT
ACTGAGAACT TGGTCTTTCT AAATTGATTG AATCCCCAAT CTCTGCTTGA GCATATCCTT
TTTTCGTTGC TTCCCGTACT TTTAGAAATT GGATTGGTTC TGGAATTAGT ATTTTGGGGA
TTTTATCTCC TCCTTGCATC GTAGTCAGTG TTGGAGATAA GCCCTCACTT CCATAGACAC
GACCTGTCTC CTTAAAGCTA GTCGGTAAAT CTCCAACAAC GACAATGCCA TAACGATCCT
GAGTATTTAA AGTAAACATC GGCTCTTGAT TTTCCTTAAA GCGTCTCCCA TTTTGTCTCT
TGTCTAATCT ATCTGGTGTC ATACAAGGAA TCGCAACTTT AAATCCTTCT CCTTTACCAC
GAACTAAGGT TGGCGCAAGA CCTTCTGAAT AATAGACTTT ACCGCTCATT CCACTTCTTG
ATGGATTCAA ATTTCCTAGT GCTTTCAAAG TCTCAGAGTT AGTTGCTTGA CCTTCTCGTC
TGAAAGGAAA TAAGAGTCTG GTACCTTTCT TTCTAGAATG TCCGATAATA AACACCCTCT
CTCTGTTTTT GGGAACGCCA AAATCCTTAC TGTTAAGCAC CTGCCACTCA ACATCAAACC
CCAACTCATC AAGTGTGGTA AGTATTGTGG TGAACGTCCG TCCCTTATCG TGATTGAGTA
GGCCTTTAAC ATTTTCAAGA AAAAGAAAAC GTGGTTGGAT TTGTTTGGCC GCCCGAGCAA
TTTCAAAGAA CAAAGTTCCT CTAGTATCTT CAAATCCCAA TCGTCTTCCT GCGATTGAAA
ATGCTTGACA AGGGAATCCC CCACAGATGA CATCGACTTT CCCTCTAAGT TTTTTAAATT
CGTCATCTGA AACATCTCGT ATGTCATGAA ATTCTATTTC TCCTTCCGTT TGAAAAATGG
ACTTATAAGA TTTCCTAGCA AATTTATCAA TCTCACAAAA TCCCAAGCAC TCATGCCCTT
GAGCTTCCAT TCCCATCCTA AAGCCTCCTA TCCCAGCAAA TAAATCTAAA ACCCAAATCA
TTCATACCTC TCTCAACTAG ATGTAACTTA CAAAACCCCT GACCTCATGA GCCACTTTCT
TCCTCCTCAT GAGGTCAGTT TTACTTTCTG CTGTTCCAGT ATCGTTTTTC CTCGCTAGAT
TTCCTCAAAA GGGCAGACTC CTCCCTTGGT TCGTCACACG ATTTTTTCAT CTCGACTGTT
CTTTAATGCA TCATTAACGA CGCTTTTCTT CTAGGTGGTT CATAAGGAAC AGGAAGATTC
AGGTTGACTT TTCTAATCCT AGAATAAAGT GCTGAAAACA ATTCGGAATA GGCATAGAGA
CTAGACAATT TGAGGAGCTG CTTGCGTCCT GTTCGAACAC ATTTTCCTAC CACGTGAAGA
AAAAGATGGC GGAAGCGTTT GATTGTTAAA GTTTGGAAGT CACCTCCAGC TAGATGTTTG
AGAAAAAGAT AGAGATTGTA GGCGATACAG CTCA^ATCA TACGAACTCG TTTTTGATTA
AGGTTGAACT ATCCGTTTTA TCGCCAAAAA ATCCCTCCTT CATCTCC™ ATGAAATTCT 
CGGCTT^CC ACGTCCACGA TAAAGCTGAA ^nCTTO GCTTGTTCCG GTACCGA 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8148 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCGTGGAACA AGCCAAGACC AGTTTCAGCT TTATCGTGGA CGTGGTCAAG CCGAGAATTT 
CATCAAGGAG ATGAAGGAGG GATTTTTTGG CGATAAAACG GATAGTTCAA CCTTAATCAA 
AAACGAAGTT CGTATGATGA ^AGCTG^ CGCCTACAAT CTCTATCTTT TTCTCAAACA 
TCTAGCTGGA GGTGACTTCC AAACTTTAAC AATCAAACGC TTCCGCCATC o™ COTCA 
CGTGGTAGGA AAATGTGTTC GAACAGGACG CAAGCAGCTC CTCAAATTGT CTAGTCTCTA 
™CC™«CC GAATTGTTTT CAGCACTTTA TTCTAGGATT AGAAAAGTCA ACCTGAATCT 
iccnmcc, TATCAACCAC CTAGAAGAAA AGCCTCGTTA ATCATCCACT AAAGAACACT
CGAGATGAAA AAATCGTGTG ACGAACCAAG GGAGGAGTCT GCCCTTTTGA GGAAATCTAG 
CGAGGAAAAA CGATACTGGA ACAGCAGAAA GTAAAACTGA CCTCATGAGG AGGAAGAAAG 
TGGCTCATGA GGTCAGGGGT TTTGTAAGTT ACATCTAGTT GAGAGAGGTA TGAATGATTT 
GGGTAAATAC AATGAGCTTG AAAGAAGTAG CAAACTCACC AAGCGCCAAT TCTTTGAGAA 
TCAGATGCTG GATTATACCA TCATTGCGCA TGAGAGTTTT GAAATCATCC GTCATTCTGT 
CTACCAGACA GATGATCGTG AAGTGGAAAA TGCTCTGGCT ^GAAg™ AAAATGATGA 
AACAGACAAG CTGATTCTGT TATTAAGCGA GGATATTGGT GTAGGTGAAA AATTGTGCCT 
CGTTGACGGA ACAAAAATGC GTGGAAAATG TTTAGTATAT GATAAAATAA ATGAGAGAAT 
GATTCGCTTG CAGTGCTAGA AATAGGCATT TTGAATAGTG AATATGTTAT AATAAGTATT 
AGTAGGAGGT GTTTTAGATT GGAGAAGAAA CTGACCATAA AAGACATTGC GGAAATGGCT 
CAGACCTCGA AAACAACCGT GTCATTTTAC CTAAACGGGA AATATGAAAA AATGTCCCAA 
GAGACACGTG AAAAGATTGA AAAAGTTATT CATGAAACAA ATTACAAACC GAGCATTGTT 
GCGCGTAGCT TAAACTCCAA ACGAACAAAA TTAATCGGTG Tl^A^GG TGATATTACC 
AACAGTTTCT CAAACCAAAT TGTTAAGGGA ATTGAGGATA TCGCCAGCCA GAATGGCTAC
AGCGATGAGG GCAATCACTA AAATCAGAGG AGGATAGATT AGAGCCACTT CTTGAGGGTA
TTTATAGGCC AGAAGGAGTG GAATAAGATT TCCGAAAATC ATCAGATAAA AGAGGATGAT 
AAAGACTTGG TTCCCAATAC TATCGGCCTC ACGCCGTTTG TATTCGTCAA GGGGACCAGA 
AATACCGTAT GTGCGTTTGA TCAGTTTTTC AGTGAAGGTT TCTTTTTTCA TGAGTTTGCT 
CCTTTTTTAA AAATCTTCCT CCCAAAAGAG ACTGTTGAGG TCAGTTTGGA GGCTGCGGGC 
GAGATTGAGA CAGAGTTCCA AGGTTGGATT GTACTTGTCG TTTTCAATCA TATTGATAGT 
CTGTCTCGAG ACACCGATAT CCTTGGCGAG TTCGAGCTGG GAAATACCCA ATTCCTTGCG 
AAATTCTTTC ACACGATTCA TCTGTTCTCC TTTCTGATTT ATGTCGTATA TATTTGACTA 
TATTATAGTC TTTTAAACAT AAAGTGTCAA GTATTTTTGA CATATTTTTT GAAGAAATAG 
TAGTCTCCTT GTCCTATTTG TCTGACAAGT GCAAGCTGGT CGGATTTGTG GTAAAATAGA 
TAAGATATGA CAAAAGAATT TCATCATGTA ACGGTCTTAC TCCACGAAAC GATTGATATG 
CTTGACGTAA AGCCTGATGG TATCTACGTT GATGCGACTT TGGGCGGAGC AGGACATAGC 
GAGTATTTAT TAAGTAAATT AAGTGAAAAA GGCCATCTCT ATGCCTTTGA CCAGGATCAG 
AATGCCATTG ACAATGCGCA AAAACGCTTG GCACCTTACA TTGAGAAGGG AATGGTGACC 
TTTATCAAGG ACAACTTCCG TCATTTACAG GCATGTTTGC GCGAAGCTGG TGTTCAGGAA 
ATTGATGGAA TTTGTTATGA CTTGGGAGTG TCTAGTCCTC AATTAGACCA GCGTGAGCGT 
GGTTTTTCTT ATAAAAAGGA TGCGCCACTG GACATGCGGA TGAATCAGGA TGCTAGCCTG 
ACAGCCTATG AAGTGGTGAA CAATTATGAC TATCATGACT TGGTTCGTAT TTTCTTCAAG 
TATGGAGAGG ACAAATTCTC TAAACAGATT GCGCGTAAGA TTGAGCAAGC GCGTGAAGTG 
AAGCCGATTC AGACAACGAC TGAGTTAGCA GAGATTATCA AGTTGGTCAA ACCTGCCAAG 
GAACTCAAGA AGAAGGGGCA TCCTGCTAAG CAGATTTTCC AGGCTATTCG AATTGAAGTC 
AATGATGAAC TGGGAGCGGC AGATGAGTCC ATCCAGCAGG CTATGGATAT GTTGGCTCTG 
GATGGTAGAA TTTCAGTGAT TACCTTTCAT TCCTTAGAAG ACCGCTTGAC CAAGCAATl'G 
TTCAAGGAAG CTTCAACAGT TGAAGTTCCA AAAGGCTTGC CTTTCATCCC AGATGATCTC 
AAGCCCAAGA TGGAATTGGT GTCCCGTAAG CCAATCTTGC CAAGTGCGGA AGAGTTAGAA 
GCCAATAACC GCTCGCACTC AGCCAAGTTG CGCGTGGTCA GAAAAATTCA CAAGTAAGAG 
GGAAAAAGAT GGCAGAAAAA ATGGAAAAAA CAGGTCAAAT ACTACAGATG CAACTTAAAC 
GGTTTTCGCG TGTGGAAAAA GCTTTTTACT TTTCCATTGC TGTAACCACT CTTATTGTAG 
CCATTAGTAT TATTTTTATG CAGACCAAGC TCTTGCAAGT GCAGAATGAT TTGACAAAAA 
TCAATGCGCA GATAGAGGAA AAGAAGACCG AATTGGACGA TGCCAAGCAA GAGGTCAATG
CTCAGATTGC TGACGAGAAA AATGGTGGTT ATCTAGTCGG GTTAACCGAC TATATTTTCT
CGGCTGTATC GATGAGTCCG GCTGAAAATC CTGATTTTAT CTTGTATGTG ACGGTCCAAC
AACCTGAACA TTATTCAGGT ATTCAGTTGG GAGAATTTGC CAATCCTATC TTGGAGCGGG
CTTCAGCTAT GAAAGACTCT CTCAATCTTC AAACAACAGC TAAGGCTTTA GAGCAAGTAA
GTCAACAAAG TCCTTATCCT ATGCCTAGTG TCAAGGATAT TTCACCTGGT GATTTAGCAG
AAGAATTGCG TCGCAATCTT GTACAACCCA TCGTTGTGGG AACAGGAACG AAGATTAAAA
ACAGTTCTGC TGAAGAAGGG AAGAATCTTG CCCCGAACCA GCAAGTCCTT ATCTTATCTG
ATAAAGCAGA GGAGGTTCCA GATATGTATG GTTGGACAAA GGAGACTGCT GAGACCCTTG
CTAAGTGGCT CAATATAGAA CTTGAATTTC AAGGTTCGGG CTCTACTGTG CAGAAGCAAG
ATGTTCGTGC TAACACAGCT ATCAAGGACA TTAAAAAAAT TACATTAACT TTAGGAGACT
AATATGTTTA TTTCCATCAG TGCTGGAATT GTGACATTTT TACTAACTTT AGTAGAAATT
CCGGCCTTTA TCCAATTTTA TAGAAAGGCG CAAATTACAG GCCAGCAGAT GCATGAGGAT
GTCAAACAGC ATCAGGCAAA AGCTGGGACT CCTACAATGG GAGGTTTGGT TTTCTTGATT
ACTTCTGTTT TGGTTGCTTT CTTTTTCGCC CTATTTAGTA GCCAATTCAG CAATAATGTG
GGAATGATTT TGTTCATCTT GGTCTTGTAT GGCTTGGTCG GATTTTTAGA TGACTTTCTC
AAGGTCTTTC GTAAAATCAA TGAGGGGCTT AATCCTAAGC AAAAATTAGC TCTTCAGCTT
CTAGGTGGAG TTATCTTCTA TCTTTTCTAT GAGCGCGGTG GCGATATCCT GTCTGTCTTT
GGTTATCCAG TTCATTTGGG ATTTTTCTAT ATTTTCTTCG CTCTTTTCTG GCTAGTCGGT
TTTTCAAACG CAGTAAACTT GACAGACGGT GTTGACGGTT TAGCTAGTAT TTCCGTTGTG
ATTAGTTTGT CTGCCTATGG AGTTATTGCC TATGTGCAAG GTCAGATGGA TATTCTTCTA
GTGATTCTTG CCATGATTGG TGGTTTGCTC GGTTTCTTCA TCTTTAACCA TAAGCCTGCC
AAGGTCTTTA TGGGTGATGT GGGAAGTTTG GCCCTAGGTG GGATGCTGGC AGCTATCTCT
ATGGCTCTCC ACCAAGAATG GACTCTCTTG ATTATCGGAA TTGTGTATCT TTTTGAAACA
ACTTCTGTTA TGATGCAAGT CAGTTATTTC AAACTGACAG GTGGTAAACG TATTTTCCGT
ATGACGCCTG TACATCACCA TTTTGAGCTT GGGGGATTGT CTGGTAAAGG AAATCCTTGG
AGCGAGTGGA AGGTTGACTT CTTCTTTTGG GGAGTGGGAC TTCTAGCAAG TCTCCTGACC
CTAGCAATTT TATATTTGAT GTAAGAATGG CACCCTGATG TTTCAGGG
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9909 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TACTCCACCC TTAATATCCG TTCCTGTAAA TACTTTACCG CTTTTAAGTT CATAGAATTG 
AACTTTTAAA TGCTTGTCTT CAAGCATCTT TTCCATCCAA TTTTTAGGAG TTTGACCAGC 
TTTAAATAAA AACCTTGCTG GGGTGATTAG TATAGATTTA TCTGCGATTT TATAAGCTTC
ATCAATAAAA TAGTGATATA TCGGCTCATC TCTGGCTTCT CCTGTTTCCT GATACGGAGG 
ATTTCCTATC ACGACATCAA ATTTCATTTC ACTTTCCTCG CTAGATAGGC GCTCAAAACC 
TATCATTCTA TTCTTTTTCC AGTCTTTGAT ATGGGTTTTA GATTCTTCTA CTTCTTGGAC 
TTCTAGCTCA TCCGCAAACA AACTCAATTG TTGAGATTGC TTTTGTTTAG CTGAATAAGG 
ACTACTTTTT TTCAATCCAT CCATCTGAAA GACATTGTAA GAGATAATAG TCGCAATTTC 
TTTCTTTTGC TCTAATGTTG GTTGATTTCC AGTCTTAGCT AGATAATAGT CCTCAAAAGT 
TGCCAAAAGA TTCTCACGCG CCAAAAGGAG AGAATCTCCT TGATACTCAT AACCATACGA 
AGCATGATAA GCATCTTTTA CAAGTTTATA AAATGTGACT TCATCTGAAA CCTCACGACT 
AATCCGTTGC AGTTTTCTAT CAACAAAACC AACTCGCTCA GATAATGGAA TTTCCTCACC 
AGTTACGGTA TCATATCTCG TTACCATATA AGGTGCTTCA CCACAAGTTA CCTCTAACCA 
TCGTAAGTCC ACATACTCCT CAAGACTTAA CGAGCCTAAT TTCGATTCTA CATATCCATT 
TTGCTTTGCG ACCAACCACG TTGGTGTAAA CACTTCTGCC CTTATTTTTG TCCGATCTTT 
™*TCATAT TTGGATTTTT CAGATCTGGG CTGAATCAAG TTGGCAAAGT TTCCAGTAAC 
CTTACTTGGA TTGATGCGAT CACTTGGAGC AAATCCCTTT CCTAACAATT CATAAGAATG 
CGTAnGCCAA ACAATTGATT TCTTTGTCGT TCGATCTTTT AAAAGAATTT TTAATAAGTC 
AGCCGATTCT TTAGCCAAAC TTTCTTCACT AATATCTATT GTCATCAGCA ACCTCTCTTA 
TATTGTAAGC CCTATTATAT CATATTTTAA AGAATGAAAA TTTACTTGAA AAAAGTAATT 
CAATAAATAT CTCTCCGATG ACCAACTTCT AGAGTAGCAA CGACTAATTC ATCATCTACA 
ATTTGTACGA TAACTCGATA ATTACCAATT CTATAGCGCC ATTGACCAAC GCGATTACCA 
ACCAAAGCCT TTCCGTGTCG TCTTGQGTCT TCCAAAACAT TGGTTTGTAA ATAGTTTGTA 
ATTAGCTTCT GCGTATAACG GTCCAATTTT TTCAATTGCT TGATAAAACG TCTTGTTGGA 
ACTAATTTAT ACAAATTATT CATCCTTCAA GCCTAAATCA TGCATCATTT CTTCCCAAGT 
AATGGGTTCA ACTCCTTTTT CCAAGTCTPC TAAATACTCT TGATAGGCTA AATCTGCCAC
ACGAGCATCG TATTCATCTT CTAGGGCTTC AAGAGTTTTG GTGCGAATAA GTTCCGAAAG
GGAAACTCCT TCAAACTTAG CCATTGCTTT CATAAATGTT TTATCAGCTT CAGAAACTTT 
TAATGTAATA GTAGTCATCT TTTGTGCTCC CTTTTTTAAT GGTAACACCA TTGTATTACT 
TTTTAGGTGT TCAGTCAATA TAAAAAGAAC ACCTTCTCAG CGTTCTTTCT ATATCTCTGT 
CAATGGTGTT GCGGTATCTG GTGAGGTATC ATAAACCTTA AAGTCTACTC CGACTCCCAG 
ATCAGCTTGA GCCAGCTGAT TGACCATGGT CATATGAGCC AGTTCCTTGA TATTGTTTTC 
CTTAGATAAA TGCCCAAGGT AAATCTTCTT AGTACGATTT CCTAGCGTCC GAATCATAGC 
TTCAGCACCG TCCTCGTTAG AAAGGTGACC AAGGTCAGAT AGGATTCGTT GTTTGAGTCG 
CCAAGCGTAA GAACCTGATC GCAAAATCTC TACATCATGG TTGGCCTCGA TAAGATAACC 
ATCCGCATTT TCGACAATGC CCGCCATACG GTCACTGACA TAACCTGTAT CTGTCAAGAG 
GACAAAACTC TTATCATCCT TCATAAAGCG ATAGAACTGC GGTGCGACTG CATCATGGCT 
TACACCAAAA CTCTCGATGT CGATATCTCC AAAGGTTTTG GTTTTACCCA TTTCAAAAAT 
ATGCTTTTGC GAAGAATCCA CCTTGCCAAG ATATTTACTA TTTTCCATAG CTTGCCAGGT 
CTTTTCATTG GCATAAAGAT CCATACCATA CTTGCGAGCC AAAACGCCTA CTCCATGGAT 
ATGATCTGAA TGCTCATGGG TAATCAAGAT GGCATCCAGG TCTTCTGGCT TACGGTTAAT 
TTCAGCTAGC AGACTGGTAA TTTTCTTGCC AGACAAGCCT GCATCTACTA AAAGCTTCTT 
TTTTGAGGTT TCCAGATAAA AAGAATTTCC ACTGGAACCC GACGCTAAAA TACTGTATTT 
AAAGCCTATT TCACTCATTC TAGTCTTCTA CTTCATCCTC CCATACTTCT TCTTTCACTG 
CATCCTTATC ATAAGGGAGT ACAATGGTAA AGGTTGAACC CTTGCCGTAT TCACTCTTGG 
CCCAAATAAA GCCCTTATGT TGTTTGATAA TTTCTTTAGC GATAGACAGT CCTAGACCTG 
TACCACCTTG TGCACGACTT CTAGCACGAT CCACACGATA GAAACGGTCA AAGATACGTG 
GTAAATCCTG CTTAGGAATC CCCAAACCGT GGTCAGAAAT GGATAAAATC ATCTGGTCTT 
CAGTTGTCTT CATTCTGACA GTGATTTTAC CCCCATCTGG CGAATACTTA ATAGCATTAT 
TTAAAATATT GTCGACAACC TGCGTCATCT TATCTGTATC AATTTCCATC CAGATAGAAT 
TGATGGGATA ATCTCTCACC AACTCATATT TTTTCTCCTT TTCCTGTCCT TTCATCTTGT 
CAAAACGATT GAGGATAAAG GTAATAAAAG CAGTGAAGTT AATCAGTTCC ACATCTAGGT 
GACTGGTAGC ATTATCAATA CGTGAAAGAT GGAGGAGATC CGTCACCATG CGCATCATAC 
GGTTGGTCTC ATCAAGAGAA ACCTTGATAA AGTCTGGTGC TACAGTTTCA CACAAAGCCC 
CCTCATCCAA GGCTTCAAGA TAGGATTTTA CGCTAGTCAG AGGAGTCCGT AACTCATGGC 
TAACATTGGA AACAAAGAGT CTTCGTTCGC GTTCTTCCTT CTCCTGCTCC GTCGTATCAT
GCAAAACAGC CACCAAACCT GAAATAAAGC CAGACTCTCG ACGTATCAAG GCAAAGCGAA
CTCGAAGGTT CAAATATTCG CCATTGATAT CTTGGGAATC TAGCAACAAT TCTGGACTTT
GGGTAATCAA ATCACGCAAT TCATAGTTTT CTTCTATCTT GAGCAATTCC AAAATGCTTC
TATTCAGAAC ATCTTCCTTA ACCAACCCCA GTTGCTTCTT GGCTGTATCG TTAATCATGA
TAATCTGACC CCGACGGTTA GTCGCAAGAA CCCCATCTGT CATATAAAAC AGAATACTAT
TTAGCCTCTT ACTCTCTTGT TCTAGATTTT CCTGAGTGAG ACGAATAACC TCCGACAAGT
CATTCAAATT ATTGGTAATA TTGGTGATTT CAGACCCACC TTGCATATCA AGAACCTTGG
AATAATCTCC TGCAATCAAA TCTTTAACCT TTTGATTGAC TTGCTTCAAC TGAATATTAT
CACGTCTATT TTCCAGTAAT AAGAGGGTCA CAACAAGGAT GAAACCTAAC AAAATCAGGA
TAAAGATAAA ATCTCTGGTA AAAATGGTTT GTTTCAGTAA ATCAAGCATT ATTTCTCATG
TAATACCCTA CACCACGGCG CGTCAAGATA TACTCTGGTC GGCTGGGCGT ATCTTCAATC
TTCTCACGCA GACGTCGTAC AGTCACATCA ACTGTACGGA CATCACCAAA ATAGTCATAA
CCCCAGACAG TCTCAAGCAA GTGTTCGCGC GTGATGACTT GACCTGTATG CGATGCTAAA
TGATACAAAA GCTCAAATTC ACGATGGGTT AAGTCTAGTT CTTCGCCATA TTTTTTAGCC
ACGTAGGCGT CTGGAACAAT TTCTAAATCC CCAATTTGGA TAGGTTGAGG TTTACTATCT
GCTTCCTGAC CATCTACTGG CATAGGTTGA GAACGACGCA GAAGAGCTTT AACACGCGCC
TGCAACTCAC GATTGGAGAA GGGTTTTGTT ACATAGTCAT CTGCCCCAAG TTCCAAACCG
ATAACCTTAT CAAATTCACT ATCTTTGGCT GAAAGCATAA GAATGGGCAC ACTGCTTGTC
TTACGAATGG TCTTAGCAAC TTCTAAACCA TCAATTTCTG GAAGCATCAA ATCCAGAATA
ATAATATCTG GTTGCTCTGC TTCAAATTGC TCTAGCGCTT CACGACCATT AAAAGCAGTT
ACAACTTCGT AACCTTCCTT GGTCATATTA AACTTGATAA TATCCGAGAT TGGTTTCTCA
TCATCTACAA TTAGTATTTT TTTCATATGT TCACCTTTTT CTCTACTATT ATACCAAAAA
AATAGTCAGA AGACACAATA GCTAGTCTTG GCTACTGTCT AAGTTGGCTT GTGCATAAAC
CTGCCAGATT TTTTGTTGGG GTTTGGCAAG TGGGTAATTC TTGAATTCTT CTGGTGAAAG
CCAGCGAACT TCCCTATCTG AAAAATCATG GAAGTCACTC ACCTGACCTG CTACAATCTG
TACATGCCAT TTTCGATGAC TAAAAACATG CTGGACTGTA TCAAAACAAA CATCAAGCCA
ATCAACATCT AGGTCATAGT CCTGCTGGAA ACTCTCTTCT GGACTGGGAC CAAAGTTCAC
ACTTTCTTCC GCAACCTGAT GAAAGAGGTC AAACTGCTCT TCTTGCGAAA AGTTATCAAC
TTCTATAAAG GGGAAATGCC AAAAACCTGC CAAGAGCTTT TCGCTTTCAT TTTTTTCAAG
TAAAAATTGT CCTTGAGAAT TTTTCACAAC TAAGGCTTTA AGATAAATAG GAACCGGCTT
TTTCTTAGGA GATTTAATTG GATAACGGTC CATGGTTCCA TTCTGATATG CCGCACTAAA 
GTCCTTGACT GGGCTTTCTT CAGGTCTGGG ATTTACAGGA GACTCAATAT CAGACCCTAA 
GTCCATCAAG GCTTGATTAA AATCACCCGG ACGATCCGGA TTAATCAAGA TCTCCATCAT 
TGCCTGAAAA ATTTTTCGAT TACTTGGAAT CCCAATATCG TGGTTGACTT CAAACAGACG 
CGCCAAGACC CGCATGACAT TACCATCTAC AGCTGGCTCA GGCAAGTTAA AAGCAATACT 
GGAAATGGCT CCTGCTGTGT AAGGTCCAAT CCCTTTCAAG CTGGAAATTC CTTCATAGGT 
ATTTGGAAAT TGGCCACCAA AGTCAGTCAT AATCTGCTGG GCTGCAGCCT GCATATTGCG 
AACTCGAGAA TAATAGCCCA AGCCCTCCCA AGCTTTCAGT AAACTCTCCT CAGGCGCAGT 
TGCCAGACTT TCGACAGTTG GAAACCAGTC CAAAAATCTT TCGTAGTAAG GGATAACTGT 
ATCCACCCTG CTCTGCTGAA GCATGATTTC AGATACCCAG ATGTGATAAG GATTTTTACT 
TCTCCTCCAA GGCAAATCTC TTTTGTTTTC ATCATACCAA GCGAGAAGTT TCTCACGGAA 
AGAAATGACT TTCTCCTCCG GCCACATGAC GATACCGTAT TCTTTCAAAT CTAACATATC 
TCTAGTATAA CACAGAAGGT TTCACCTGTC TTTGTATCTG ATTTATAATA TTTTCAATAG 
ATAGTATATA ACTTTTCTAT CTACTTATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAG 
CTAGCCGCAG GTTGCTCAAA ACACTGTTTT GAGGTTGTGG ATAGAACTGA CAGAGTCAGT 
ATCATATACT ACGGCAAGGT GAAGCTGACG TAGTTTGAAG AGATTTTCGA AGAGTATAAA 
TCTTATTGAT GAACTGCTTG CAGTCTGAGA AAAAATGAGC TTGGATATTA TTTCCAAACT 
CACTTAAAGT CAATTTCAAT CCACTAGAAC AAGCCTAGTA CAGTTCCATC GCTTTCAACA 
TCCATGTTGA GAGCTGCTGG ACGTTTTGGA AGACCTGGCA TGGTCATAAC ATCACCAGTT 
AAGGCAACGA TGAAGCCTGC ACCTAATTTT GGTACCAATT CACGAATGGT AATTTCAAAG 
TTTTCTGGTG CTCCAAGCGC ATTTGGATTG TCTGAGAAAC TGTATTGAGT TTTAGCCATA 
CAGATTGGCA ATTTGTCCCA ACCGTTTTGA ACGATTTGAG CAATTTGTGT TTGAGCTTTC 
TTCTCAAAGT TCACTTTGCT ACCACGATAG ATTTCAGTGA CAATTTTTTC AATCTTTTCT 
TGGACAGAAA GCTCATTATC ATACAAACGT TTATAGTTAG CTGGATTTTC AGCAATTGTC 
TTAACAACTG TTTCGGCAAG TGCTACTCCA CCTTCTGCTC CATCAGCCCA GACACTAGCC 
AATTCAACTG GTACATCGAT TGAGGCACAG AGTTCTTTTA AGGCTGCAAT TTCAGCTTCT 
GTATCAGATA CAAATTCGTT AATAGCTACA ACTGCTGGAA TACCGAACTT ACGGATATTT 
TCAACGTGGC GTTTCAAGTT AGCAAAACCT GCACGAACTG CCTCTACATT TTCTTCAGTC 
AGAGCGTCTT TAGCCACACC ACCATTCATC TTAAGGGCAC GAAGGGTTGC GACAATAACA
ACTGCATCTG GAGATGTTGG CAAGTTTGGT GTCTTGATAT CAAGGAATTT CTCAGCACCA
AGGTCCGCAC CAAAACCAGC TTCAGTAACA GTGTAATCAG CCAAGTGAAG GGCTGTTGTC 
GTCGCCAAAA CAGAGTTACA GCCATGAGCG ATATTGGCAA ATGGACCACC GTGTACAAAG 
GCAGGTGTAC CGTAAATTGT CTGAACCAAG TTTGGCTTAA TAGCATCCTT CAAAATCAAA 
GCCAAGGCAC CCTCAACCTG CAAATCACCT ACAGAAACAG GCGTACGGTC ATAGCGATAA 
CCAATAACGA TATTCGCCAA ACGACGTTTC AAGTCCTCGA TGTCCGTTGC CAAGCAAAGA 
ATTGCCATGA TTTCTGAAGC AACTGTAATA TCAAAACCAT CCTCACGTGG AATACCGTTT 
AGAGGACCAC CAAGACCAAC AGTCACATGG CGGAGCGTAC GGTCGTTCAA GTCCACAACG 
CGTTTCCAGA GGATACGACG TTGATCAATT CCCAGCTCAT TCCCTTGGTG CAAGTGGTTG 
TCAATCAAGG CAGAAAGGGC ATTGTTGGCA GTTGTAATAG CATGCATATC TCCAGTAAAG 
TGGAGGTTGA TGTCTTCCAT TGGCAGAACT TGTGCATACC CACCACCAGC AGCACCACCC 
TTGATCCCCA TGACTGGACC AAGAGACGGT TCGCGGATAG CAATCATGGT TTTCTTGCCA 
ATCTTGTTCA AGGCATCCGC AAGACCAATG GTAAGCGTCG ACTTTCCTTC ACCTGCAGGT 
GTTGGGTTGA TGGCAGTAAC CAAGATCAAT TTACCGACTG GATTGCTCTC AACTGCACGA 
ATTTTATCAA AGCTGAGTTT AGCCTTGTAC TTTCCGTACA ACTCCAAATC GTCATAAGAA 
ATACCAAGTT TCTCTACAAC ATCAACAATT GGCTTCAACT CAATACTCTG TGCGATTTCA 
ATATCTGTTT TCATTCAAAA TTCCTCTAAC CTCTTATATG ATAATTCATT ATATCACAAA 
ACAAGATTTT TAACATCCTA AAACTCTCTA AACGTTCGTA AATATCTCTC TTTTTAAGAC 
TTTTAGAGTC CTTTCTTAAA TTTTATATGG CTTTATAGTT TGAAACTATA ATAAATCTTC 
GTTTTTACCA AAAATTTATC ACTTTCATTT TACTTACCGC TTATTTTTGT GTACAATAGT 
GCTATGAAAA TTTTAGTTAC ATCGGGCGGT ACCAGTGAAG CTATCGATAG CGTCCGCTCT 
ATCACTAACC ATTCTACAGG TCACTTGGGG AAAATTATCA CAGAGACTTT GCTTTCTGCA 
GGGTATGAAG TTTGTTTAAT TACGACAAAA CGAGCTCTGA AGCCAGAGCC TCATCCTAAC 
CTAAGTATTC GAGAAATTAC CAATACCAAG GACCTTCTAA TAGAAATGCA AGAACGTGTT 
CAGGATTATC AGGTCTTGAT CCACTCAATG GCTGTTTCTG ACTACACTCC TGTTTATATG 
ACAGGGCTTG AGGAAGTTCA GGCTAGCTCC AATCTAAAAG AATTTTTAAG CAAGCAAAAT 
CATCAGGCCA AGATTTCTTC AACTGATGAG GTTCAGGTTT TGTTCCTTAA AAAGACACCC 
AAAATCATAT CCCTAGTCAA GGAATGGAAT CCTACTATTC ATCTGATTGG TTTCAAACTG 
CTGGTTGATG TTACCGAAGA TCATCTGGTT GACATTGCAC GAAAAAGTCT TATCAAGAAT
CAAGCAGATT TAATCATCGC GAATGACCTG ACTCAAATTT CAGCAGATCA GCACCGAGCT
ATATTTGTTG AGAAAAATCA GCTTCAAACA GTCCAGACTA AAGAAGAAAT TGCAGAACTC
CTCCTTGAAA AAATTCAAGC CTATCATTCT TAGAAAGGAA AACTATGGCA AACATTCTCT
TGGCTGTAAC GGGTTCAATC GCCTCTTATA AGTCGGCAGA TTTAGTCAGT TCTCTAAAAA
AACAAGGCCA TCAAGTCACT GTCTTAATGA CTCAGGCTGC TACAGAGTTT ATCCAACCTT
TGACACTACA GGTACTCTCA CAGAATCCTG TCCACTTGGA TGTCATGAAG GAACCCTATC
CTGATCAGGT CAATCATATC GAACTTGGAA AAAAAGCAGA TTTATTTATC GTGGTACCTG
CAACTGCTAA CACTATTGCA AAACTAGCTC ACGGATTTGC GGACAACATG GTAACCAGTA
CAGCTCTAGC CCTACCAAGT CATATTCCCA AACTAATAGC TCCTGCTATG AATACAAAAA
TGTATGACCA TCCAGTAACT CAGAATAATC TGAAAACATT AGAAACTACG GCTATCAGCT
GATTGCTCCT AAGGAATCCC TACTAGCTTG TGGAGACCAC GGACGAGGAG CTTTAGCTGA
CCTCACAATT ATTTTAGAAA GAATAAAGGA AACTATCGAT GAAAAAACGC TCTAATATTG
CACCCATTGC TATCTTTTTT GCTACCATGC TCGTGATACA CTTTCTGAGC TCACTTATCT
TTAACCTTTT TCCATTTCCA ATCAAACCGA CCATTGTTCA TATTCCTGTC ATTATTGCCA
GCATTATTTA TGGTCCACGA GTTGGGGTTA CACTTGGATT TTTGATGGGA TTACTTAGCT
TGACGGTTAA CACGATTACG ATTCTACCGA CAAGCTACCT CTTCTCTCCC TTCGTACCAA
ACGGAAACAT CTACTCAGCT ATCATTGCCA TCGTCCCACG TATTTTGATT GGTTTAACTC
CTTACTTAGT CTATAAACTG ATGAAAAACA AGACTGGTCT GATTTTAGCT GGAGCCCTTG
GTTCCTTGAC AAATACTATC TTTGTCCTTG GAGGAATCTT CTTCCTATTT GGAAATGTTT
ATAATGGAAA TATCCAACTT CTTCTGGCAA CCGTTATCTC AACAAATTCA ATTGCTGAAT
TGGTCATTTC TGCAATTCTA ACCCTAGCCA TTGTTCCACG ACTACAAACC TTGAAAAAAT
AAAAACAGG

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TAATTTTCAT ATAATAGTAA AATAGAATGT GTGATTCAAT AATCACCTCA AATAGAAAGG 
AAATTCTATG TCAAATCTAT CTGTTAATGC AATTCGTTTT CTAGGTATTG ACGCCATTAA
TAAAGCCAAC TCAGGTCATC CAGGTGTGGT TATGGGAGCG GCTCCGATGG CTTACAGCCT
CTTTACAAAA CAACTTCATA TCAATCCAGC TCAACCAAAC TGGATTAACC GCGACCGCTT 
TATTCTTTCA GCAGGTCATG GTTCAATGCT CCTTTATGCT CTTCTTCACC TTTCTGGTTT 
TGAAGATGTC AGCATGGATG AGATTAAGAG TTTCCGTCAA TGGGGTTCAA AAACACCAGG 
TCACCCAGAA TTTGGTCATA CGGCAGGGAT TGATGCTACG ACAGGTCCTC TAGGGCAAGG
GATTTCAACT GCTACTGGTT TTGCCCAAGC AGAACGTTTC TTGGCAGCCA AATATAACCG 
TGAAGGTTAC AATATCTTTG ACCACTATAC TTACGTTATC TGTGGAGACG GAGACTTGAT 
GGAAGGTGTC TCAAGCGAGG CAGCTTCATA CGCAGGCTTG CAAAAACTTG ATAAGTTGGT 
TGTTCTTTAT GATTCAAATG ATATCAACTT GGATGGTGAG ACAAAGGATT CCTTTACAGA 
AAGTGTTCGT GACCGTTACA ATGCCTACGG TTGGCATACT GCCTTGGTTG AAAATGGAAC 720 
AGACTTGGAA GCCATCCATG CTGCTATCGA AACAGCAAAA GCTTCAGGCA AGCCATCTTT 780 
GATTGAAGTG AAGACGGTTA TTGGATACGG TTCTCCAAAC AAACAAGGAA CTAATGCTGT 840 
ACACGGCGCC CCTCTTGGAG CAGATGAAAC TGCATCAACT CGTCAAGCCC TCGGTTGGGA 900 
CTACGAACCA TTTGAAATTC CAGAACAAGT ATATGCTGAT TTCAAAGAAC ATGTTGCAGA 960 
CCGTGGCGCA TCAGCTTATC AAGCTTGGAC TAAATTAGTT GCAGATTATA AAGAAGCTCA 1020
TCCAGAACTG GCTGCAGAAG TAGAAGCCAT CATCGACGGA CGTGATCCAG TCGAAGTCAC 1080
TCCAGCAGAC TTCCCAGCTT TAGAAAATGG TTTTTCTCAA GCAACT 1126
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2520 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CCGGCAACAA AAAAGAAAAA ATCAACAGTT AAAAAAAATC TAGTCATCGT GGAGTCGCCT 60
GCTAAGCCAA GACGATTGAA AAATATCTAG GCAGAAACTA CAAGGTTTTA GCCAGTGTCG 120
GGCATATCCG TGATTTGAAG AAATCCAGTA TGTCCGTCGA TATTGAAAAT AATTATGAAC 180
CGCAATATAT TAATATCCGA GGAAAAGGCC CTCTTATCAA TGACTTGAAA AAAGAAGCTA 240
AAAAAGCTAA TAAAGTTTTT CTCGCGAGTG ACCCGGACCG TGAAGGAGAA GCGATTTCTT 300
GGCATTTGGC CCATATTCTC AACTTGGATG AAAATGATGC CAACCGTGTG GTCTTCAATG 360
AAATCACCAA GGATGCAGTC AAAAATGCTT TTAAAGAACC TCGTAAGATC GATATGGACT 420
TGGTCGATGC CCAACAAGCT CGTCGGATCr TGGATCGCTT GGTAGGGTAT TCGATTTCGC 480
CTATTTTGTG GAAGAAGGTC AAGAAGGGCT TGTCAGCAGG TCGCGTTCAG TCCATTGCCC 540
TTAAACTCAT CATTGACCGT GAAAATGAAA TCAATGCCTT CCAGCCAGAA GAATACTGGA 600
CAGTTGATGC TGTCTTTAAA AAGGGAACCA AACAATTTCA TGCTTCCTTC TATGGAGTAG 660
ATGGTAAAAA GATGAAACTG ACCAGCAATA ACGAAGTCAA GGAAGTCTTG TCTCGTCTGA 720
CGAGTAAAGA CTTTTCAGTA GATCAGGTGG ATAAGAAAGA GCGCAAGCGC AATGCTCCTT 780
TACCCTATAC CACTTCATCT ATGCAGATGG ATGCTGCCAA TAAAATCAAT TTCCGTACTC 840
GAAAAACCAT GATGGTTGCC CAACAGCTCT ATGAAGGAAT TAATATCGGT TCTGGTGTTC 900
AAGGTTTGAT TACCTATATG CGTACCGATT CGACTCGTAT CAGTCCTGTA GCGCAAAATG 960
AGGCGGCAAG CTTCATTACG GATCGTTTTG GTAGCAAGTA TTCTAAGCAC GGTAGCAAGG 1020
TCAAAAACGC ATCAGGTGCT CAGGATGCCC ATGAGGCTAT TCGTCCGTCA AGTGTCTTTA 1080
ATACACCAGA AAGCATCGCT AAGTATCTGG ACAAGGATCA GCTTAAGCTA TATACCCTTA 1140
TCTGGAATCG TTTTGTGGCT AGCCAGATGA CAGCGGCCGT TTTTGATACC ATGGCTGTTA 1200
AATTGTCTCA AAAAGGGGTT CAATTTGCTG CCAATGGTAG TCAGGTTAAG TTTGATGGTT 1260
ATCTTGCCAT TTATAATGAT TCTGACAAGA ATAAGATGTT ACCX3GACATG GTTGTTGGAG 1320
ATGTGGTCAA ACAGGTCAAT AGCAAACCAG AGCAACATTT CACCCAACCG CCTGCCCGTT 1380
ATTCTGAAGC AACACTGATT AAAACCTTAG AGGAAAATGG GGTTGGACGT CCATCAACCT 1440
ACGCGCCAAC CATTGAAACC ATTCAGAAAC GTTATTATGT TCGCCTGGCA GCCAAACGTT 1500
TTGAACCGAC AGAGTTGGGA GAAATTGTCA ATAAGCTCAT CGTTGAATAT TTCCCAGATA 1560
TCGTAAACGT GACCTTCACA GCTGAAATGG AAGGTAAACT GGATGATGTC GAAGTTGGAA 1620
AAGAGCAGTG GCGACGGGTC ATTGATGCCT TTTACAAACC ATTCTCTAAA GAAGTTGCCA 1680
AGGCTGAAGA AGAAATGGAA AAAATCCAGA TTAAGGATGA ACCAGCTGGA TTTGACTGTG 1740
AAGTGTGTGG CAGTCCAATG GTCATTAAAC TTGGTCGTTT TGGTAAATTC TACGCTTGTA 1800
GCAATTTCCC AGATTGCCGT CATACCCAAG CAATCGTGAA AGAGATTGGT GTTGAGTGTC 1860
CAAGCTGTCA TCAGGGACAA ATTATTGAGC GAAAAACCAA GCGTAATCGC CTATTCTATG 1920
GTTGCAATCG CTATCCAGAA TGTGAATTTA CCTCTTGGGA CAAGCCTGTT GGTCGTGACT 1980
GTCCAAAATG TGGCAACTTC CTCATGGAGA AAAAAGTCCG TGGTGGTGGC AAGCAGGTTG 2040
TTTGTAGCAA AGGCGACTAC GAGGAAGAAA AGATGGCTCT TTGTCAACTG TAGTGGGTTG 2100
AAGTCAGCTA AGCTCGAGAA AGGACAAATT TTGTCCTTTC TTTTTTGATA TTCAGAGCGA 2160
TAAAAATCCG TTTTTTGAAG TTTTCAAAGT TCCGAAAACC AAAGGCATTG CGCTTGATAA 2220

GTTTGATGAG ATTATTGGTC GCTTCCAATT TGGCGTTAGA ATAGTGTAGT TGAAGGGCGT 2280 

TGACGATTTT CTCTTTGTCC TTTAGAAAGG TTTTAAAGAC AGTCTGAAAA AGAGGATGAA 2340 

CCTGCTTTAG ATTGTCCTCA ATGAGTCCGA AAAATTTCTC CGGTTCCTTA TTCTGAAAGT 2400 

GAAACAGCAA GAGTTGATAG AGCTGATAGT GATGTTTCAA GTCTTGTGAA TAGCTCAAAA 2460 

GCTTGTTTAA AATCTCTTTA TTGGTTAAAT GCATACGAAA AGTAGGGCGA TAAAAATGTT 2520 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10993 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TTTTCTCGAT AATAACTTCC ACCTTATTAT TTGGGATACC CTCCTCTTCT TCACCACCAC 60
GTTCATAGTA GTCATCGCGA TAGAGAAAAG CTACGATATC AGCGTCCTGC TCAATAGACC 120
CAGATTCACG AATATCAGAC AAGACCGGTC TCTTGTCCTG ACGTTGTTCT ACACCACGAG 180
AAAGCTGACT CAGAGCGATT ACTGGAACCT TCAATTCCTT GGCTAGTATT TTCAACTGAC 240
GAGAAATTTC AGAAACTTCT TGTTGACGAT TTTCTCGACC AGTTCCCGTG ATAAGTTGCA 300
AATAGTCTAT CAAAATCAAA CCAAGATTTC CAGTTTCTTG AGCCAATTTA CGAGAACGAG 360
AACGAATCTC TGTAATCCGA ATACCTGGCG TATCATCGAT ATAGATACTG GCGTTAGcTA 420
GATTACCCTG AGCAATAGTA TATTTTTGCC ACTCCTCATC TGTCAATTGC CCTGTACGGA 480
TAGAATGTGA CTCCACTAAG CCTTCTGCAG CTAACATACG ATCTACCAAG CTTTCCGCAC 540
CCATTTCGAG TGAAAAAATA GCAACCGTTT TGTCCAACTT AGTCCCAATG TTCTGAGCGA 600
TATTCAAGGC AAATGCTGTC TTACCAACTG CTGGACGAGC TGCTAAGATA ATCAACTCCT 660
CCTCATGAAG TCCTGTTGTC ATATGATCCA AATCACGATA ACCTGTCGCA ATACCTGTAA 720
TATCGGTCGT TTGTTGCGAG CGAGCTTCCA GATTTCCAAA GTTGAGATTC AACACATCTC 780
GAATGTTCTT AAACCCGCTT CGATTTGCAT TTTCACTGAC ATCAATCAAC CCTTTTTCTG 840
CCTGAGCAAT AATTTCATCA GCTGGTTGTG ACGCTTCGTA AGCTTGGTTG ACAGACTCTG 900
TCAACTTGGC AATTAAACGA CGTAGCATTG CTTTTTCTGC AACAATCTTA GCATAATACT 960
CCGCATTAGC AGAAGTTGGC ACAGAATTAA CAATCTCAAC CAAGTAAGAC AAGCCACCAA 1020
TATTCTCTAA ATCACCTTGA TTATCAAGGA TAGTACGAAC CGTTGTTGCA TCTATGGCAT 1080
CACCACGATC GGATAAATCG ACCATGGCTT GGAAAATCAA ACGATGGGCA TACTTAAAAA 1140
AGTCCCGAGA CTCAATGTAT TCTCGCACAA AAACAAGTTT ACTCTCATCA ATAAAGATAG 1200
CCCCTAAAAC GGATTGCTCA GCTAAGATAT CTTGAGGTTG TACTCGTAAC TCTTCTACTT 1260
CTGCCATCAG ACTTCCCTTC CTTTTACAAT CTTGTCAAGA AGGTGTAAAC TTATCCTTCT 1320
TTCACACGAA GATTGATTAC ACTTGTGATA TCTTGATAGA TTTTCACTGG CACATCAATC 1380
AAACCAACCG CTCGAATCGG AGCTTGTACT TGAATATGAC GTTTATCAAT CTTAATTCCA 1440
AATTGCTTTT GCAATTCTTC TGCAATCTTC TTATTGGTAA TAGAACCAAA GGTACGACCA 1500
TCTGGACCAA CTTTTTCAAC AAATTCTACA ACAGTTTCTT CTGCTTCAAG TTGTGCTTTA 1560
ATTGCTTTTC CTTCTGCAAT CATCTCAGCG TGAGCTTTTT CTTCCGATTT TTGTTTACCA 1620
CGAAGTTCAC CTACAGCTTG AGCAGTCGCT TCTTTGGCTA GATTCTTTTT GATAAGAAAG 1680
TTTTGCGCAT ACCCTGTTGG TACTTCCTTA ATTTCGCCTT TTTTACCTTT TCCTTTAACA 1740
TCTGCTAAAA AGATTACTTT CATTCTTCTT TCTCCTTTTC CTTCATTTCA TTTAATACAA 1800
TTTCTGTCAG TTTTTCACCT GCTTCTGACA AGGTTACATC TTTAATTTGA GCTGCTGCCA 1860
AATTAAAGTG GCCTCCACCG CCTAACTCTT CCATAATCCG TTGTACATTC AGTTTACTAC 1920
GACTTCGAGC TGAGATAGAG ATAAATCCTT GTGTATTCTT CGCAAGAACA AAACTCGCTT 1980
CAATACCTGA CATGGCTAAC ATGGCATCTG CTGCCTTACT AATAACAACT GTATCATAGC 2040
ATTTCATGTC CTTAGCCTCT GCTATTAGTA CATCTGAACC TAATTTACGC CCCTGTAAAA 2100
TAAGTTCATT GACCTCACGA TATTCTTCAA AATCTGTCGC AGCGATTTCC TGGATAGCAA 2160
TACTATCACT TCCGCGCGTT CTGAGATAGC TAGCAACATC AAATGTCCGA CTAGTTACTC 2220
GCGAGGTGAA ATTTTTAGTA TCCAACATCA TACCAGCCAT CAAGACACTT GCTTGCATAC 2280
GACTCAAACG ATTTTTCTTA GAATTCTGGA ACTGAATCAA TTCCGTTACC AACTCACTGG 2340
CACTACTTGC ACCACTTTCG ATATAAGTAA TAACCGCATT ATCTGGAAAA TCCTGATCCC 2400
TTCTATGGTG GTCAATAACA ATGGTTTGGG TAAATAAATC ATAAAATTCT TTTGATAATG 2460
TTAAGGCTGT CTTTGAATGG TCTACAAGAA TCAACAAAGA ACGATTGGTC ACCATCCCCA 2520
TTGCATCCTT AACAGACAAC AACTTCGTAA CTCCTTCTTT TTCTATGAAT GAAACAGCTC 2580
GTTCAATATC TGGAGACATT TGTTCTTCAT CATAAAGAGC ATAGCTATTT TCAATCACAT 2640
TGCTGGCGAA CAACTGCATA CCTACAGCAG AGCCCAAAGC ATCCATGTCT AAATTTTTGT 2700
GACCGACTAC AAAAACCTGA TCTACACTCC GAATCTTATC TGAAATAGCT GTCATCATAG 2760
CGCGCGTACG AGTCCGTGTA CGCTTGATTG AAGCAGCAGA CCCACCACCA AAATAAACTG 2820






GATTTTTCGT TTCGTCGTTT TCCTTAACAA CCACCTGGTC GCCACCACGT ACTTCAGCCA 2880
AGTTCAAATT GAGCAAAGCA ACTTTCCCTA TCTCATCATG ATTTCCATCG CCATAAGAAA 2940
ATCCCATACT TAAGGTCAAG GGCAACTGTC TCTGTTTCGA CTCTTCTCTG AAAGCATCAA 3000
TAACAGAAAA TTTATCATTC ATCAAGCCCT CAAGCACCGT GTAGTCAGTA AATAGATAAA 3060
ATCGATCCAT ACTTACCCGA CGAGAAAACA TCATGTGTTT TTCTGAAAAC TCTGATATAA 3120
AATTAGCTAC AAAACTATTG ATTTGACTAA TATCTGACTC AGAAGTTTCA TCCTCCAAAT 3180
CATCATAATT ATCCACAGAG ACAATCCCAA TCACTGGTCT ACTTGTTACC AATTCATCTG 3240
TTATGGCTTG TTCCCTGGAT ACATCTACAA AATACAAAAC ACCGGAAGAA GCATCCATAT 3300
GAACAGCATA ACGCTTCTCA CCAAGCTTGG CATAAGTAGA CGGATTTCCT ACTGAAGCCT 3360
TGATAATCGT TTGAACAGCT TCTAAATCAA AATCACCATC TTCCTTGGTC AAAATCAATT 3420
CAGCATAGGG ATTAAACCAC TCAACCTCTC CAGAAGATAA ATTCAATTTC ATAACACCTA 3480
CAGGCATCTG TTCCAATAGA GCTGTCAAAC TTTCTTCCGC TTGGTGGTTT ACATACTGTA 3540
TCTGTTCTAC ATCACTCCTT GTATAATGCA CTCTCAGTTT CTTAAATAAA AAAACATAGC 3600
CTCCTACAAA AAGAAACAAA ATTAAAACCG TCAACAGATT ATTATTAACA AAAATAATGA 3660
AAGTGGATAA GACTCCAAAC GCAATCAATC CTACTAGAAT AGGAAAAATT GGACTTACAT 3720
AAAATTTTTT CATTCAAAAC CTCTTGGCAC CCATTATACC ATAATACCCC TCAAAAAGCG 3780
ACTTTTTAAA AGTGTAATCA GTAATTCTAT CAATTATAAG AAAAAGGTAG TTTACAATTC 3840
AGTAAACCTA CCTTTACACA TATTGAAATT AAGATTCTTT AACCTCTAAC AAACCAATTT 3900
CGCCATCCTC ACGACGATAA ATCACATTGG TTGTCTGATC TTCAACATCC ACATAGATAA 3960
AGAAATCATG CCCCAATAAA TCCATTTGTA GAATTGCTTC TTCCAAATCC ATTGGTTTTA 4020
AATCAATTTG TTTTGAACGA ACAACTTTAG ACTGGACAAT ATTTGAATCT TCCACCAAAG 4080
CATCTGTAAA TAATTGACCA GTTGCTACCT TATTTTTATT TTTACGCTCG ATTTTTGTTT 4140
TATTTTTACG AATCTGACGT TCAATTTTAT CAGTTACAAG GTCAATTGAA CCATACATAT 4200
CTTGAGATAC ATCTTCTGCG CGGAGAGTAA TAGATCCAAG CGGAATCGTT ACTTCCACTT 4260
TAGCCGTTTT TTCACGATAA ACTTTTAAGT TAATTCGGGC ATCCAACTCT TGTTCTGGTT 4320
GGAAGTACTT TTCGATCTTT TCGAGTTTAG AAACTACATA ATCACGAATT GCTTCTGTTA 4380
CTTCTAGGTT TTCACCACGG ATACTATATT TAATCATATG AGTACCTTCT TTCTAAACAT 4440
TTTTGTTTTT ATGATTTTAT TATAACGCTT TCATTCTATT TTTGCAAATT TTTTCCTCAT 4500
CTTACAAGGG AAAATGTTTT TACATCCTTA GCACCAGCTT CTTCCAACAG TTTCTTAACA 4560
CGATTTATAG TTGCTCCTGT AGTATAGATA TCATCTATAA GTAGGATTTT TTTAGGAATA 4620
GTGACTCCAC TTTTAATAAA GAAAGGAAGT TCTGTCCCCA AGCGCTCTGA ACGATTTTTA 4680 
GAAGAACTGG CTCTCTCTTC TCTTTTCTCT AATAAATCCA GATACTCAAA GCCTGCTGCC 4740 
TCTACCAAGC CCTCAACCTG ATTAAATCCT CTATTAGCAT ATCTATCAGG ACTTAGGGGA 4800 
ATTACAACAA ATTGATACTC TTTGTACTTT TTCAACTCCT CACTTAAAAA TGAAGCGAAA 4860 
ACTTTTCTTA ACAGGAAGTC TCCATCAAAC TTATACCGAC TGAAAAAATC CTTCATAGCT 4920 

TGATTGTAAG TAAAAATCGC TCTATGACTG ACTTCAACTC CCTCTTTACA CCAAAGTTGA 4980

CAATCTTGAC ACTTTGTTGA CAACTCTGTT TTCATACAAT TTGGACAGTT CTCTTCCCCA 5040 

ATTCTTTCAA AAGTAGAATC ACAGTCTGAA CAAAGACAAG AGTCATCATT CCTCAGAAGT 5100 

AAGAGACTAC TAAAAGTTAA AACAGTCTTC ATAGTCTGCC CACATAACAA GCACTTCATA 5160 
GACCAGCCTC CTTATTCATC ATCTGAATTT CCTTAATCGC CTTCTTGATT GAAGCATTTA 5220

ACCCATCATG GAAGAAAAGC AAATCTCCTG TCGGTCTATC CATGCTTCGT CCAACTCGTC 5280 

CACCAATCTG AATCAAACTA GACTTGGTAA ACAAACGATG ATTGGCCTCT ACTACGAAAA 5340

CATCCACACA AGGGAAGGTA ACTCCGCGCT CCAAGATTGT CGTACTGATA AGTATTGTCA 5400 

GTTCTCCATC TCGAAAAGCT TGTACTTGCT CTAATCGATC CTCTGTTACA GAAGATACAA 5460 

AGCCAATTTT CTCATTTGGA AATTGCTCCT GTAAGATTTC TGCTAACTGC TCCCCTTTCT 5520 

TAATTTCTGA AGCAAAAATG AGTAACGGAT AAGCTGTCTT TCTCTGCTTC TCAATATAGG 5580 

ACTTTAACTT TGGTGACAAA CGATTCTTGT CTAAGTAGCG ATTAAAATCC GATAACCAAA 5640 

TTGGTTTTGG AATAATCAAC GGATTTCCAT GAAACCGTCT CGGTAAATTC AGTCTTTTTA 5700 

GTTCTCCTAA ACGGACCTTT TTATCTAACT CATTGGTCGA AGTCGCTGTT AAAAAGATTC 5760 

TCAATCCATT CTCCTTTACA CTATTCTTGA CAGCGTGGTA AAGCATGGGA TTATCAACAT 5820 

AAGGAAAAGC ATCTACTTCA TCCACTATCA GCAAATCAAA AGCTTGATAA AACTTCAATA 5880 

ACTGATGGGT TGTTGCAACA ACTAGTGGTG TTCGAAAATA AGGTTCCGAT TCTCCATGTA 5940 

GCAAAGCTAT CCCGCAAGAA AAATCCTGTT GCAGGCGCTT GTACAGCTCC AAACAAACAT 6000 

CTATGCGAGG ACTAGCCAAA CACACTGCAC CACCCGCATT GATCACTTTA GCCACTACTT 6060 

GATAAATCAT TTCTGTCTTT CCAGCTCCTG TTACCGCATG AACTAAGGTT GGCTTTTGCT 6120 

TGTCTACTAC TTGAAGCAAT CCCTCTGACA CCTTCTCTTG AAAAGGAGTT AATTGGCCGC 6180 

GCCATTTGAG AACATCTTGC TTTGGAAAAT CCTCCTGCGG AAAATAGTAT AAAGTTTGAT 6240 

CACTTCTGAC TCGCTTCATC AGCAAGCACT CTCGACAATA GTAAGCACCG ATGGGCAAAT 6300 

ACCATTCTTC TAGAATAGTA CTATTACAGC G1TGACAGAA AAGTTTCCCC TTCTCCTTTC 6360 

TCATTGCTGG AAGTTTCTCC GCCAACTGAC GTTCTTCTTC TGTTAATTCA TTCTCAGTAA 6420

ATAAACGACC GAGATAATCT AAATTTACTT TCATACTTCT TTATTCGTAA AAACTAGCAC 6480

TTTAGATGAT TTTTTAGTAC AATTAAATCA TGGAATTTAG GACAATTAAA GAGGACGGTC 6540

AAGTCCAAGA AGAAATCAAA AAATCTCGCT TTATCTGCCA TGCCAAGCGT GTTTATAGCG 6600

AAGAAGAGGC TCGTGACTTC ATTACTGCCA TCAAAAAAGA ACACTACAAA GCGACACATA 6660

ACTGCTCTGC CTTCATTATT GGAGAACGTA GTGAAATTAA ACGTACAAGT GATGATGGTG 6720

AGCCTAGTGG TACTGCTGGT GTTCCCATGC TTGGGGTACT AGAAAATCAC AATCTCACCA 6780

ATGTCTGTGT GGTCGTGACA CGCTACTTTG GTGGTATTAA ACTAGGCGCT GGAGGACTAA 6840

TTCGTGCTTA CGCCGGCAGT GTCGCCTTAG CTGTCAAAGA AATTGGTATT ATTGAAATAA 6900

AAGAACAGGC TGGCATTGCr ATTCAAATGT CTTATGCTCA GTACCAAGAG TACAGTAACT 6960

TCCTTAAAGA ACATGGTCTC ATGGAGCTGG ATACAAACTT TACAGATCAA GTCGATACGA 7020

TGATTTATGT TGATAAAGAA GAAAAAGAAA CTATTAAAGC TGCACTTGTG GAGTTTTTTA 7080

ATGGAAAAGT CACTTTAACT GACCAAGGTT TACGAGAGGT TGAAGTTCCT GTAAACTTAG 7140

TGTAAACAAT GAATAATACA GCGTTTCGTT GACATTCTCA CAACTACTTT AGCGAGCAAA 7200

ATAAAAAGAG GCGTACCAAA ATATACTAGA AAATGAAGCA ATTCAAACGA AACCTGATAT 7260

CGTTTTCCTT CACACCTATT TACTAGAATT AGCTGAACGC AATCACTTPGA AAATTAATGA 7320

CTTTGATCTA TGATATATAG AAATGGTATG GATAGCGTTA TACTAAAGAT ATCTTATACA 7380

AAGAGGTATT CATATGTCTA TTTAPAACAA CATTACTGAA TTAATCGGTC AAACACCGAT 7440

TGTTAAACTT AACAACATCG TGCCAGAAGG TGCTGCAGAC GTCTATATAA AGCTTGAAGC 7500

ATTTAATCCT GGTTCATCTG TAAAAGACCG TATTGCCCTT AGCATGATTG AAAAAGCTGA 7560

ACAAGATGGT ATTCTGAAAC CTGGTTCTAC TATTGTTGAA GCAACAAGTG GAAACACCGG 7620

TATTGGACTT TCATGGGTAG GTGCTGCTAA AGGGTATAAA GTCGTCATCG TTATGCCTGA 7680

AACTATGAGT GTAGAACGAC GTAAAATTAT CCAAGCTTAT GGTGCTGAAC TCGTCCTAAC 7740

TCCTGGTAGC GAGGGAATGA AAGGTGCTAT TGCTAAGGCT CAAGAAATCG CTGCTGAACG 7800

TGATGGTTTC CTTCCTCTTC AATTTGACAA TCCAGCTAAT CCAGAAGTAC ACGAAAGAAC 7860

AACAGGAGCT GAGATACTAG CTGCTTTCGG TAAAGATGGA TTAGATGCCT TTGTTGCTGG 7920

AGTAGGTACT GGTGGAACGA TTTCTGGTGT TTCTCATGCA CTCAAATCAG AAAATTCTAA 7980

CATTCAAGTT TTTGCAGTAG AAGCAGATGA ATCTGCTATT CTATCTGGTG AAAAACCTGG 8040

TCCTCACAAA ATTCAAGGTA TCTCAGCTGG ATTTATTCCT GATACACTTG ATACTAAAGC 8100

CTATGATGGT ATCGTTCGTG TAACATCAGA TGACGCTCTT CCACTCGGAC

GCACTCGGAC GTGAAATTGG 
8160
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8700
8940
TTATTCAAGG CTGCTGCCAT TGTAGCTGCA ACTTCAGCTT CGAAGTCGTT
TCGATACCTT CACCAACTTC AAAGCGAGCA AACTCAACTA CCGAAGCGTT 
AGGTATGCTT CAACTGTCTT GCTGTCATCC ATGATGTAAA CTTGTGCAAG 
GCTTGGTCAA CTTTAGTGTT ATCAAGCATG AAGCGATCCA TTTTACCTGG 
TCCCAGATTT TTTCTGGTTT GCCTTCTGCA GCCAATTCAG CTTTGATGTC 
TGAGCAATAA CATCATCAGT TAATTGAGCT TTTGATCCAT ACTTCAAGTG 
GGTTTATTAA CCATTGCACG GCTTTCGTTG TCTTGGTCGA TAACGTGATT 
AACTCATCTT TAACGAATTG CTCATCCAAT TCTTTGTAAG AAAGAACTGT 
GCTGCGATGT GCATTGACAA TTGTTTAGCA AGTGCTTCGT CTCCACCTTC 
ATAACACCGA TACGTCCACC GTTATGTTGG TATGCTCCAA AGTGTTGTGC 
TCAATCAATG CAAAGCGACG GAATGAGATT TTCTCTCCGA TAGTTGCTGT 
TATGCAGCTT CAAGAGTTTC ACCTGAAGGC ATTATCAAAG CAAGAGCTTC 
GCAGGTTTTC CTTCAGCAAT GACTTTAGCT GTAGTATTTA CCAATTCAAC 
TTTTTTGCAA CGAAGTCAGT TTCAGCGTTT ACTTCAATAA CTGCTGCAAC 
ACATAAACAC CAGTCAAACC TTCTGCAGCA ACACGGTCAG CTTTCTTAGC 
ATACCTTTTT CACGAAGCAA TTCAATCGCT TTTTCGATGT CACCGTCTGT 
GCTTTTTTAG CGTCCATAAC ACCGGCACCA GATTTTTCAC GCAACTCTTT 
GCTGTAATTT CTGCCATTTT AATTCTCCTA TATTTTTTGA AAATAGGAGA 
GCCCCGCCTC CGG 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



TGCAGCTTTC 
AACTGATTCA 
AAGTGTGTAA 
AATAATTTTG 
AGCTTCAGCT 
TGGAAGAGCT 
CAATTGTGCC 
TGGTTTCATC 
AACAACTGAA 
GTCTGTTTTT 
TGCAGATACG 
TTCGTTGTTA 
GAATTGAGCG 
ATTACCGTTA 
TGCCTTAGCC 
TTCTACAAGC 
TACAAGTTTA 
GCGCGGCTAA 



9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
10993 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CGACGGGGAG GTTTGGCACC TCGATGTCGG CTCGTCGCAT CCTGGGGCTG TAGTCGGTCC 60
CAAGGGTTGG GCTGTTCGCC CATTAAAGCG GCACGCGAGC TGGGTTCAGA ACGTCGTGAG 120
ACAGTTCGGT CCCTATCCGT CGCGGGCGTA GGAAATTTGA GAGGATCTGC TCCTAGTACG 180
AGAGGACCAG AGTGGACTTA CCGCTGGTGT ACCAGTTGTC TTGCCAAAGG CATCGCTGGG 240
TAGCTATGTA GGGAAGGGAT AAACGCTGAA AGCATCTAAG TGTGAAACCC ACCTCAAGAT 300

GAGATTTCCC ATGATTATAT ATCAGTAAGA GCCCTGAGAG ATGATCAGGT AGATAGGTTA 360 

GAAGTGGAAG TGTGGCGACA CATGTAGCGG ACTAATACTA ATAGCTCGAG GACTTATCCA 420 

AAGTAACTGA GAATATGAAA GCGAACGGTT TTCTTAAATT GAATAGATAT TCAATTTTGA 480 

GTAGGTATTA CTCAGAGTTA AGTGACGATA GCCTAGGAGA TACACCTGTA CCCATGCCGA 540 

ACACAGAAGT TAAGCCCTAG AACGCCGGAA GTAGTTGGGG GTTGCCCCCT GTGAGATAGG 600 

GAAGTCGCTT AGCTTTAATC CGCCATAGCT CAGTTGGTAG TAGCGCATGA CTGTTAATCA 660 

TGATGTCGTA GGTTCGAGTC CTACTGGCGG AGTAATtGAT AAAAGGGaAC ACAGCTGTGT 720 

TCCTCTTTTT GTATCAATTT GTATCACCAA GCATTTTCAT AAGGAAGTCT GTTATTTCTT 780 

GAGAACTTTC TTTTTTTCCA TGTGCAATCC AAGTTTGGCA GACACCAAAA AGTGCATGAG 840 

TTAGATAGAT GCTACTATAT TCTAATTCAG TGGTATTTAG ATTCAGTTGC ATAAATCGCT 900 

TTTGTAAATC TGTACTAAGC ATGATATGAA GTTTATTTCG TAAGAAATTT TGGATTTCTT 960 

TAGTCCCATT TTCAGAAAGA AGGGCAGCCA GAAGTGGTTC TGACTCTAGA TATTCAAAAA 1020 

CTTCTAAAAT AGCGTCTCTT TTGTGATGAG CATGTTTTTG AAAAATATAT TCAAATGTAT 1080 

GGAATAGCTT GCTTTGATAG TGCTCAATCA TATCATACTT ATCCTTATAG TGAGTATAGA 1140 

AGCTGGAACG ACTAATTCCG GCTTTTTCTA CTAATTTGAC AGTAGAAATT TTATCAAATG 1200 

GCTGTTCCAT CAGTAATTGT ACCATAGCAT TTTCAATAGT TCGCTTTGTT TTTAAGCGTT 1260 

TGTTACTTTC TTGCATATTT CCTCCTTGTA AACAAATTAG ACTATATGTC TAAAAATAGA 1320 

TTTTTTATCT TGTAATTTAG ATTTTTTAAT GTATAATCTA TTATATCAAA ATTTTAGACA 1380

ATATGTTTAA AAAAGGAGAA ACTAAGTTTA AAGAATGGAA AGCAATTTAA AAAAAACCAA 1440 

CCTTTATTAT TGTCATGATC GGGATTTCTC TTATTCCAGA TCTGTACAAT ATCATATTTT 1500 

TGTCATCAAT GTGGGATCCA TATGGGCAAT TGTCTGACTT ACCTGTGGCA GTTGTAAATA 1560 

ATGATAAAGA GGCTTCCTAT AATGGTAATA CTATGGCAAT AGGAAAAGAC ATGGTGTCCA 1620 

ATTTAAAAGA AAATAAAACC TTGGATTTTC ATTTTGTAGA TGAAGAGGAA GGAAAGAAGG 1680 

GATTGGAAGA TGGCGATTAC TATATGGTAG TGACTTTACC AAGTGATTTA TCTGAAAAAA 1740 

CAACTACATT ATCCAATATT CAATCGACAG CAGCTTATCA ATCATTGACA AGTGAGCAAC 1800 

AAACTGAGAT AAGTGATTCT GTATCTCAAA ATTCAACTGA TAGTATTCAA TCGGCTCAGT 1860 

CAATTGTAGC TTTAGTACAA GATTTACAGG GAAGTTTAGA AAACTTACAA AATCAATCTT 1920 

CTAATCTTTC GACTTTAAAA AATCAATCTA ATCAAGTATC ACCTATTACT TCTACTTCTT 1980 

TGATAGGATT GTCAAGTGGA TTAACAGAGA TACAAGGAGA TGTTACTAGC AAATTAGTTC 2040 

CTGCCAGTCA GTCGATTGCA TCAGGTGTAA ACGCATATAC TACAGGTGTT GATAAAGTTT 2100

CTCAGGGCGC AAGTCAACTA AGTGAAAAAA ATGCCACCTT GACAGGTAGT TTGGATAAAC 2160

TAGTTTCAGG CTCAAACACC TTGACACAAA AATCTTCTAG ATTGACAGCA GGAGTTGGTT 2220

AATTACAATC AGGATCTGGG CAATTAGCAG ACAAATCCAG TCAGTTACTT TCAGGTGCTT 2280

CTCCATTAGA GAATAGAGCT AATAAATTGG CAGATGGATC TGGGAAACTA GCAGAAGGTG 2340

GAACAAAGTT AACTTCTGGA TTGGAAGATT TACAGACAGG ACTTGCTTCT TTAGGACAAG 2400

GACTAGGTAA TGCTAGTGAT CAACTCAAAT CAGTATCAAC AGAATCTAAA AATGCAGAGA 2460

TTTTGTCAAA TCCACTCAAT CTTTCAAAAA CAGACAATGA TCAAGTTCCT GTAAATGGAA 2520

TCGCAATAGC TCCTTATATG ATATCAGTTG CTCTTTTTTT GCAGCAATAT CAACAAATAT 2580

GATATTTGCG AAATTGCCTT CAGGACGTCA TCCAGAGAGC CGTTGGGCTT GGTTGAAATC 2640

TTGAGCTGAA ATAAATGGTA TTATAGCTGT TTTGGCAGGA ATTTTGGTAT ATGGAGGAGT 2700

TCAGCTTATT GGTTTAACTG CTAATCATGA GATGAGAATA TTTATTCTCA TCATCCTAAC 2760

AAGTTTAGTA TTCATGTCTA TGGTGACCAC TTTAGCAACG TGGAATAGCC GTATAGGAGC 2820

TTTTTTCTCA CTTATTTTGC TTTTACTACA GTTAGCATCA AGTGCAGGTA CTTATCCACT 2880

TGCTTTGACA AATGATTTCT TTAGATCTAT TAATCCCTGG TTACCAATGA GCTATTCAGT 2940

TTCGGGATTA CGACAAACAA TCTCTATCAA CAAGTCATTT TCCTAGCTGT CATACTAGTT 3000

CTATTTACTA GTTTAGGTAT GCTAGCCTAT CAACATAAGA AAATGGAAGA AGATTAAAAA 3060

AATCGACCGA TTAACTGGTC GATTTTTTAT GCCTTAGATG ACTTTCGTCT GTGATTATAG 3120

ATTCCAAATA GTAAGAGAGA AGTAAAGGAA CAGATTGCTC CAGTAATAAA ACCATTGGGA 3180

ATGAAGGAAA GTGTAATAGT TCCTTTCCCC TTGGGAATGT CAACTTTCAT AAATCCAGTT 3240

TGAGCTTGTT TAATTTCTAT TTTCTTACCA TCTTGGTAGG CAGACCAACC TTTGTCATAA 3300

GGAATGGTGA AGAAAATAGA TGTATCTTGT TGGACATCAT ATGTAGCAAA AACCTTGTTT 3360

TTAGAAGTTG ATACTGTGAC AGGTTGTTCT TTAATTTTTT GAATTGCCTC GGTGAAAGTT 3420

TTGGTATCTA AACGATAGAA GGTAGGAGAT TCAAATGATA CTTGTGAATT TCCAGGGAAA 3480

CTAACATTGA TATTGAAAGT TTTTTTCTCT TTAGTATATC CTAGATTAAA GAAGGAGAAG 3540

ACATTATCAG TTGTAAAAGT CTTTTTTTCA CCATTTACAA GGATGTCAAC CTTCTTTTGT 3600

TTATCGTTAG AAAAGTGAAG GTTTATGAAA GAGAGATAAA CTTGGCTGTT TTCTGGAACT 3660

TCAATTTGAT ACTGGATTGC TGCATCTTCA TTTGAAGAAC TTGTGACACT AATCAAATCA 3720

TTAGTATTTT CTATTTTTTC TGTTTTTTCA TAAGGTATTG GAGAAAAATA ATCAAAATTG 3780

ACGTTAGCAA GTTGATTTAA AAATGAGGCC TGATTATCCA AGGTATGTTC ATTGAACTTG 3840

ACATCATTGT AAACAGATTG ACTCGCAACT GCAATCGGAA GAGAGTATTG ATTTTCATAT 3900 

AGGGTAAGAT TATCTTTTTG ATAGATATCT TTAAAGCCAT ACTTATCAAT AGGACTGTCT 3960

GAGATATTGT ACTGGATACC AAATAAACTA TCAGCCAAAA TACTATTATT TGCATATCGG 4020 

AGATTGAGAT TAGTCCCAGA GGATTTAAAA CCAAGTTTAT CTAAAGTAGA GCTTGATGAA 4080 

CGATTTCGAA CAGATGAAAA TTGAGAGATT CCATTGTAGT TGAATTTCAT ACTGTCATTT 4140 

CCTGTCTGAG TTTGTAGTTT TTCAGTACGA GTAAATTGAT TTCCAATATA TGTTGAGAAA 4200 

GATTCCATAG CTGGGATATC TCGACTATAA GCACTTCGAG AAGCAAATCC CCATTCCTTA 4260 

GCAATTCCGT CCATTTGAGA TGAAGCATTT AAACTCATTT CAACCAGTAT AAATAAAGAG 4320 

ATTAGAATGG CAAATAGATT CACAGATATA AACTTTTTGA TAACTGCAAG GAGTAAAAGA 4380 

GAATAGACAA CCAAAAATTC AAGAGTAAGC AGAATATTCA AATCTGTTAA AAAAGAATAA 4440 

TGCGATTTTA GATAGATGGT AGCTAAAAAT CCTGCTACTA CAAGAAAAAG CGAAACTAAA 4500 

AAATTCCAGA CTTTAAGTTC TTTCAGACGC TTTAAGACTT CTGCTGCTGT GTAAATTAAC 4560 

AAGGTAGAGA AAATCCAAGC ATAGCGATGT AAAAACATGT TTGGAGTATG CATGCCTTGC 4620 

CAAAATAAGT CAAGAGCTTC TATGTAAAAG CTTGCAATTA GAAATGCAAA GAATATTACA 4680 

TATATGAGTT TCACGTGAAA CTTAATAGAT TTCAGCGTAA AAAATAAAAT GGTCAAAATA 4740 

AAGGGAAATA GTCCAACAAA AATCATTGGG ATGGCCCCAT ACTTTGTTGT GTCAAAGGAA 4800 

CCAATGAATT GCTTAGCAAA GAGATCAAGA TACCAGCTAC TTTCAGTTTG AAACTTTGTA 4860 

ACTTCAGTCA ATTTTTCCCC ATGTGTCTGT AAATCAAATA GAGTGGGAAG AGTCATAATC 4920 

AAACTAGCCA TACCAGCTAA AAAGGAGATA ACTATGAAAT CAAGAACAGA TGATTTTCGA 4980 

GTCTTAAAGT CCCACGAAAT TTGACAGAGA TACCAGAAAA TAAGAAACAA TACTGTCATA 5040 

TATCCAAAAT AATAATTTTG AATAAATAAG ATTGACAGAC TTGTAAAGTA CAATAGGAGT 5100 

TTCTTTTCAG TTATCAGTAG ATGTAAACCA GTTATAATTA AAGGAATCAA GATAAAAACA 5160 

TCTAGCCAGG TTTTTATCTC TAATTGACTG ACAGTGAAAC TCATCAGAGC ATAGGAAGTA 5220 

GATAAGGCTA GTTTTAAAAT CTGAGGGATA GATTGAAACA ATTTATTCAA ACTAAAAAAG 5280 

GTTGACAGAC CAATCAATCC AAATTTTAAG AGAGTTGTCA GATAGATAGC ATCTGGCATA 5340 

TTCGTTAGAT CAAAAAAGTA AACCAGAGGC GCGAGAAAAC TACCCAAGTA ATAACTAGAT 5400 

AGGGCATAGA AGTTTAGCCC TAGACCACTT GTAAAGGTGT AAAACAGATT ACTATTTCCA 5460 

TGTAGGATAT TTCGTAAGGC TACATCAAAA ATAACGTATT GATGAAAGCC ATCTCCTAAT 5520 

AGAGGAGAGT TGTCGCTATT CCAGTAGATA CTTTGAGATA GATATACTCC AGACATAATC 5580 

ACTACAGGAA TGATGAAAGA AATAAAATAG GTTCGATATG TTTTTAAAAA TGATTTCATG 5640

TTACCTCGTA GAATGATAGA AAACTCAGTT GGTTAACCCA ACTGAGTTTT GAAGTTTTAT 5700

TTAGTCTTTC CAAAGTTCTT TAACTTTTGC TTGTACTTCT GCATTTTCTA GGAATTCATC 5760

GTAGGTTTCA TCGATACGGT CAATGACGCC ATTTTTAGAT AAGACAATGA TATGGTTAGC 5820

CAAAGTTTGA ATAAATTCGT GGTCATGGCT GGCAAAGATG ATTGATTCTT TAAAGTTTTT 5880

CAATCCATCA TTCAAGCTTG AGATAGATTC CAAGTCCAAG TGATTTGTTG GATCATCAAG 5940

TACAAGGACA TTTGATTTTA AGAGCATGAG TTTTGAAAGC ATGACACGAA CTTTTTCTCC 6000

CCCTGACAAG ACATTTACAG GTTTGTTAAC TTCATCTCCA GAGAAGAGCA TACGGCCGAG 6060

GAAGCCACGT AGGAAAGTAT TGTCATCTTC TTCTTTACTT GCGAATTGAC GCAACCAGTC 6120

AAGAATTGAT TCTCCTCCTG CAAAATCAGC TGAGTTATCT TTTGGTAGGT AAGATTGACT 6180

AGTTGTAACT CCCCACTTGA CAGTTCCTTC ATAGTCAATA TCTCCCATGA TTGCACGAAT 6240

TAATGCAGTC GTTTGAATAT CATTTTGTCC AATAAGTGCT GTCTTATCAT CTGGACGCAA 6300

GATGAAACTA ATATTATCCA AGATAGTTTC ACCATCAATC TTTACAGTTA AATTTTCTAC 6360

TGTCAAGAGA TCATTACCAA TCTCACGTTC CGCTTTAAAG TTGATAAATG GATATTTACG 6420

ACTAGATGGC ACAATCTCTT CTAGCTCAAT CTTATCAAGC ATTCTCTTAC GTGATGTTGC 6480

CTGCCTTGAC TTAGAAGCAT TGGCAGAGAA ACGAGCAACA AATTCTTGCA ATTGTTTAAT 6540

TTTTTCTTCT GCTTTAGCAT TACGGTCTGC TAGCAATTTA GCAGCAAGCT CAGAAGATTC 6600

CTTCCAGAAG TCGTAGTTTC CGACATAGAG TTTGATTTTT CCAAAGTCAA GGTCGGCCAT 6660

GTGAGTACAA ACTTTGTTTA AGAAGTGACG GTCGTGGGAT ACTACGATAA CTGTGTTATC 6720

AAAGTCAATC AAGAAGTCTT CTAACCAAGT AATCGATTGG ATATCCAAAC CGTTAGTAGG 6780

CTCGTCCAAG AGAAGAACAT CTGGTTTACC AAAAAGTGCT TTGGCGAGGA GAACCTTTAC 6840

TTTTTCACCG TTGGCCAATT CGCTCATGTT TTGGTAGTGT AATTCTTCTG GAATGTTTAG 6900

GTTTTGAAGT AGTTGAGAGG CTTCACTCTC TGCTTCCCAA CCTCCAAGTT CGGCAAACTC 6960

TCCTTCGAGT TCGGCAGCAC GAACCCCGTC CTCGTCTGAG AAATCTTCCT TCATGTAGAT 7020

AGCATCTTTC TCTTTCATGA TGCTATAAAG TTTTTCATTT CCCATGATAA CGACATCAAT 7080

GGCACGTTCA TCTTCGTAGT CAAAGTGATT TTGACGAAGA ACAGAGAGAC GTTCATCTGG 7140

ACCAAGAGAG ATGTGACCAG TAGTAGGTTC GATATCTCCA GCTAAAATTT TTAAAAAGGT 7200

TGATTTTCCG GCACCATTAG CACCGATTAA TCCGTAAGTA TTTCCTTCTG TAAATTTGAT 7260

ATTGACATCA TCAAAAAGTT TGCGATCACT AAAACGTAGT GAAACATCAG ATACTGTAAG 7320

CAATGTTTTT CTCCTATATG TGTAATATAT TTATTCTACT AGAAAATACA GAAATATTCA 7380

AATTTTTATT TGTCAATTTT GTGTAAATTA TATTTACAGT ATCCTTTACA CAAATCTGTA 7440 

AAAAGCAAGG CTGATTTATT TTGATAAATT ACGGTTATTT CATTAAAAAA ATGCTATAAT 7500 

TGAAAGGACT ATATCGAAGG AGAACAAAAT GACTAAACCC ATTATTTTAA CAGGAGACCG 7560 

TCCAACAGGA AAATTGCATA TTGGACATTA TGTTGGAAGT CTCAAAAATC GAGTATTATT 7620 

ACAGGAAGAG GATAAGTATG ATATGTTTGT GTTCTTGGCT GACCAACAAG CCTTGACAGA 7680

TCATGCCAAA GATCCTCAAA CCATTGTAGA GTCTATCGGA AATGTGGCTT TGGATTATCT 7740 

TGCAGTTGGA TTGGATCCAA ATAAGTCAAC TATTTTTATT CAAAGCCAGA TTCCAGAGTT 7800 

GGCTGAGTTG TCTATGTATT ATATGAATCT AGTTTCGTTA GCACGTTTGG AGCGAAATCC 7860 

AACAGTCAAG ACAGAGATTT CTCAGAAAGG ATTTGGAGAA AGCATTCCGA CAGGATTCTT 7920 

GGTCTATCCA ATCGCTCAAG CAGCTGATAT CACAGCTTTC AAGGCTAATT ATGTTCCTGT 7980 

TGGGACAGAT CAGAAACCAA TGATTGAGCA AACTCGTGAA ATTGTTCGTT CTTTTAACAA 8040 

TGCATATAAC TGTGATGTCT TGGTAGAGCC GGAAGGTATT TATCCAGAAA ATGAGAGAGC 8100 

AGGGCGTTTG CCTGGTTTAG ATGGAAATGC TAAAATGTCT AAATCACTAA ATAATGGTAT 8160 

TTATTTAGCT GATGATGCGG ATACTTTGCG TAAAAAAGTA ATGAGTATGT ATACAGATCC 8220 

AGATCATATC CGCGTTGAGG ATCCAGGTAA GATTGAGGGA AATATGGTTT TCCATTATCT 8280 

AGATGTTTTT GGTCGTCCAG AAGATGCTCA AGAAATTGCT GATATGAAAG AACGTTATCA 8340 

ACGAGGTGGT CTTGGTGATG TGAAGACCAA GCGTTATCTA CTTGAAATAT TAGAACGTGA 8400 

ACTGGGTCCG G 8411 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9064 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TGCCGTACTC AAGTACAGCC TGCGCTAAGT TTCCTAGTTT GCTCTTTGAT TTTCATTGAG 60 

TATTAGTAAC CAAAATCCGA CCACATAGCC AGCCCCTATG AATATAGCCA TTAAAGCTAG 120 

CATGGAATTT AGGAAATTAA AAACCACCGC AGATACAAAG GTTAGCACAA AAACATTAAA 180 

AGCAATGGTG TCAGAAGCCA AGACTAGAAT ATAGGGTGTC AACCGATCTA AAGTTTTGGA 240 

ATCTAGGAAA AATAAGTGTT TATACATGAT GACCTCCTCT ATGGCTGAAA AGCAAGCCTT 300 

TTGTTTTTTT ACCCCAAGAC CCTATGTAGA AAAGTGAGCA AAAACGGGAA GGTCGCTACA 360

ATATTATTGA TCACATGCAC CGCATAGGAT GGATAAATGC TCTTGGTATA GCGGGTCAAA 420

CCAGCAAAGA TGATTCCAAC TGTTGCAAAG ACGAAGATAT CTAACAGACT AGGCAGGCTT 480

GAAAAATGAG GGAGAGCAAA TAAAATAGAA GGAAGAAGCA AATCAAGACC AAATCGCGAA 540

TGCTTAAAGA AAGCATGTTG CAGTAATCCT CTATAAATCA ATTCTTCCAT CAGTGGAACC 600

AGAAAGAACA GGGCTATATA AATACCTAGC TCTGCAAAGT TAGTCCCACT ATAACCAATC 660

AATACAGCCC AACCTTCCGC AGTTGACTGA ACATGTTTAG CTGTCTGAAC GTTAAAAGAG 720

ATCTGGAACA CTAGCACTAA TACTGTCAAA ATCGAATACC AAAGCCATTT TTTTCTTGGA 780

ATGCGGAAGA GATAACCATG GCCTGTCTTA ACAAGAACCA CAATCATGAC TCCAATAAAA 840

AGTAAACTCA AGATATTTTG AATCCAGAAT AAATTGCCTA TCTGAGAAGA AAATTGCCAA 900

TAGTTTTGGA CGATAAGCGT CAGCTGAGAA agactaaata CGAAAAATAA GTAAGAGAAG 960

ACTGCACTTA TTTTGAATAG AAGTTGATAC tttttcatag AAATCCTCCC TACTATGACC 1020

TCACCTTGTC AGGCTCTACT GCTGTAAGAr taagaagaca GTTTGTTTTT TTTAAGGCTA 1080

ACCTGACTAC TAGATAATAG ATACATTAAG gcattaaaga CAATGAAAAT ATGTCCATAG 1140

AATAAAATCA ACCTCGCATC CAAACCAAGA TAAAGTTTGA TTATCAAAAA GATGAGCAAA 1200

AGAATTTGAA ACCATAAGGT TTTTCCAAAA ATAAATTTAA AGCGATTTCG AATATCTACT 1260

TCCTTGATTT TTACCGCCAC CCCTTTATTA GCAAGAAGGA AAACTCCTGC TTCAAACAAA 1320

CCACTGTAAA GAACAAGCCA CCCAATAGAT ACGATAGAGA TTTGTAAAAA TGTCCCTAAA 1380

AGAATATCCA ACACACTACT CAAGAAAATA ACAAAAAATA ATCTGTATTT CATATTAAAT 1440

ACCTCCATTC ATTTATTTCA CTAACAATTT AATAGAGCCT TCTACTCAAA TATCCTGTCA 1500

GAAAAGGATA GAAAGCTACT TTTTATAATA CTTCAAGCCC CACATGAGCA GAAGCGTGAT 1560

AAACAAGCAG AGAATACACC TATATAAGCG ATTAGTTGTT GATAGAATTC TGTTTCTGAA 1620

ATACCTCTAT ACAAACAAAT GACAAACATA AAATCTGCCA AGCCGATAAA CATAAGTTGA 1680

TTGGTTCTAG GACTAACCAA ATCATCATTT ACTTATATTT AAGAGTATCT CTTTTATTTT 1740

AATGTATGTT AGCACTGAAA AGCAAGACAG GCCAATAATA TTTAAAATGA ACAGTAACGG 1800

GGTTAAGTCT CTAAAAAAAT TATCTACTGA CACTACAAGA AATACTATAC ATATTATAGT 1860

CGAAACTATC TTTTTCTTAT CCATAATTAT TTACTCCTTT CCTAACAAAT CCAGCTTATC 1920

AATCAAGAGC GATTTTTAAC ATAATGTAGC AGCACCCGTT GCAACTTTGA CAAGTTTAGT 1980

ATATCATTGT TTTTTAAAAT TTTTCATCCA AATCTTGAAT TGTCATCGAA ACATCTTGAA 2040

TTGTTAAAAA ATTTAAAAAG TAAGCATTAA AAACATACTT TCCTCTTTAT ATTGTATTGA 2100

TACCAACTTG TTTGTAGACT TTTCATCCTG CTATCACATA TCATTTTGAC AGGCGAAACA 2160 

ATATTAAAGA AACTCCCCTG TAAATTAAGC TAGCAAATAC AGGGGAGAAA TTTATTTTTT 2220 

AGAGAGTACT ATCCGTATCC TTTTTGGAAG ATTTTGAAAA TATTTTTCTA ATTAAGTCAT 2280 

CCATATAAGG ACCAAATATA CCAACTACTA AACCAATAAT AAAACTTTTA AAATCCATAA 2340 

TTACCACCAA CATATTGCTG CATAGGCTAC ACCTCCAAGT ATAGCTCCAC CTGCAGCACC 2400 

AGTTACACCT ATTCCTATAG CAAATGGTCC CAATAGAAAT GTCAAACCGT TGTTGCACAC 2460 

CCATCAATTG CGCCATATGC AACCCCTGCT GCACAACTAA TTTTTCTTCC CCAATCAATA 2520 

TCTCCACCTT CAACGCAAGC AAGCATTTCA TTATCCATAA CTGCAAATTG TGACATCATT 2580 

TTTGTATCCA TATAGTGTAT CACTTTTCAG TTACGGAACA AGTTTAATAT AAAAATTATC 2640 

AAAAAAACAT AGGCAATAAA GAGAAAAATT AATTTATCAT AGATTAGAAA TAATATGACA 2700 

AAACAATTCA ATGATGTTAA TTCAATAGTC TTTTGTTTTT TATCGGAGAT ACTTATGGAT 2760 

AGATAAATAA GATAGGTTTG AAAAGCGAAG AGAATAATAA AGAATATAGC CTTCATAAAA 2820 

TTTAGCTTTC ATTTTTATGA TGTAGCGGTA TAGGCTAAAT ATCCACAAAC CACTGCTCCT 2880 

CCAATTCCTC CTATTGCAGC GCCCCATGGT CCTAGAAGTC TCCCATATTT CACTCCACCC 2940 

GCTGCACAAC CTAAAGCAGC AACTACAGCT GCTCCTCCGG AATTACCTCC ATAAACCTCA 3000 

CTCAGCATTG TTTCATTTAT ATTACAATAA GTATTCATAC AAGTCTCCTT TTATTAAAAT 3060 

CCACCCGTTG CCCCTGTTAC TCCTGCCCAA AGATCCACAC CAAATTTAGC TCCTATGTAT 3120 

CCACATGCTC CCATAAATGG TGCTCCAACA CCACTCGCAG CACAAATAGC TGTCCCTAGC 3180 

CCCCAGCCAC CAAAAGCAGC ACCACCACCT TCTAAGACAT TAGTTTGCCA ATTATTCTTG 3240 

CCTCCTTCAA TACTAGATAA CATAGTTATA TCCATTTCAT GAAATTGTTC CATAATTTTT 3300 

GTATCCATGA CAAATACTCT TTTTTATTTT TAATTTTTGT CTTGTTGTAA CTTTGACAAG 3360 

TTTAGTATAT CATCGTTTTT TAAAATTTTT CATCCAGATT TTGAATAGTC ATCGAAACGT 3420 

CTTGAATTGC AAAAATTACA TTAGACTTCC TGCAAAACTA GAATCCTAGT TCATGATTGA 3480 

TAATACCAGC ACTCAAATTC ATTCGTAATC CGAAGCGTTT ACGATGACTT CGATAGGTTG 3540 

TTGAAAACAT TTTAAACGTT TTTACTTTGG CAAAGATGTT CTCAACCTTG CTTCTCTCCT 3600 

TAGATAGCGC ATGGTTACAG GCTTTATCTT CAACTGTTAG CGGTTTGAGT TTGCTGGATT 3660 

TACGTGAAGT TTGTGCTTGA GGATATATCT TCATGAGCCC TTGATAACCA CTGTCAGCCA 3720 

AGATTTTACC AGCTTGTCCG ATATTTCTGC GACTCATTTT GAACAACTTC ATATCATGAC 3780 

AATAGTTCAC AGTGATATCC AAAGAAACAA TTCTCCCTTG ACTTGTGACA ATCGCTTGAG 3840 
TCTTCATAGC GTGAAATTTC TTTTTACCAG AATCATTCGC TAATTCTTTT TTTAGGGCGA 3900

TTGATTTTTA CTTCCGTCGC ATCAATCATT ACCGTGTCCT CAGAACTGAG AGGAGTTCTT 3960 

GAAATCGTAA CACCACTTTG AACAAGAGTT ACTTCAACCC ATTGGCTCCG ACGGAGTAAG 4020 

TTGCTTTCGT GAACACCAAA ATCAGCCGCA ATTTCTTCAT AAGTGCGGTA TTCTCGCACA 4080 

TATTGAAGAG TGGCCATAAG AAGGTCTTCT AGGCTTAATT TAGGTTTTCG TCCACCTTTT 4140 

GCGTGTTTAA GTTGATAAGC TGTTTTTAAT ACAGCTAGCA TCTCTTCAAA AGTCGTGCGC 4200 

TGAACACCAA CAAGACGCTT AAATCGTGCA TCAGTTAGTT GTTTACTTGC TTCATAATTC 4260 

ATAGAACTAT AGTAAAATGA AATAAGAACA GGATAAATCG ATCAGGACAG TCAAATCGAT 4320 

TTCTAACAAT GTTTTAGAAG TAGAGGCGTA CTATTCTAGT TTCAATCTAC TATACTATAC 4380 

CATATTTTGT TTCGCAGGGA ATCTATTATA AAAGGGTAAG TATTGCAAAA ACACTTACCC 4440 

TTTTCTTTTA TACTTCATTA AGCTCTACTT TTTATAATAC TTCAAGCCCC ACATGAGCAG 4500

AAGCATGATG ATTAAGCAGA GAACAGCGCC AATATAAGCG ATTATTTGTT GGTAGGATTC 4560 

TCCTGCTGTG ATACCTCTAT ACAAACAAAT AATAGACATA AAACCTGTCA AGCCGATGAA 4620 

CATAAGTTGA TTGGTTCTAG GACTAACCAA ATCATCATCT TCAAACTCTC TTATCCTCAT 4680 

TTCCCTAGTG AGATAAACAG TAACCAAAAT AGAAGCCAAG TTAATAACTA CTAAAAGAAA 4740 

TTGGAAAACT ACGGAAAAAT TTAAAAACTG ACGAGATAGA AATAGATAAG TAGAAACAAG 4800 

CAAGGGCAAC TGACCTAAGA ACAATCTCGC AAGGAAGATG TTCCGTTTTT TAGCAAGAAA 4860 

AGTTTTCATT TCTTTTCTCC TTTCTTTTTA TTGATAGCAA AATAGATCAT AACTGCAATC 4920 

ACATAGGCTA TGGTATAAAA TAGCTGATAC CAAGCACTCT CCCTAAGCGG ATATAGAAAG 4980 

ATGGACATGA TTAGATACAG AACGAAAATA ATCAGTATTT TTTTCTTCAT AAGATTTCCT 5040 

CCTAAATGTG CGATTTATCT TAGTTGAGCA AGAACATTTA CACTGCTAGT ATAGCACTTA 5100 

TTTTGACCTT GGATCACTCA AATCATAAAT GGTCATCAAA ACCTCTTGAA TTGTAAAAAT 5160 

TAAAAAAGCA AGCATGAAAA ACATACTTTC CTCTTTATAT TGTATTGATA CCAACTTGTT 5220 

TGTAGACTTT TCATCCTGCT ATCACATATC ATTTTGACAG GCGAAACAAT ATTAAAGAAA 5280 

CTCCCCTGTA AATTAAGCTA GCAAATACAG GGGAGAAATT TATTTTTTAG AGAGTACTAT 5340 

CCGTATCCTT TTTGGAAGAT TTTGAAAATA TTTTTCTAAT TAAGTCATCC ATATAAGGAC 5400 

CAAATATACC AACTACTAAA CCAATAATAA AACTTTTAAA ATCCATAATT ACCACCAACA 5460 

TGTTGCTGCA TAGGCTACAC CTCCAAGTAT AGCTCCACCC GCAGCACCAG TTGCTGCACC 5520 

TTGCCATGTT CCTGTTTTAA TGCCTAGTTG AAGACCTCTT GCTGCTCCTC CTCCAACACC 5580 
TGCTTTGGCA AAATCTCCCC AATTGCATCC GCCACCTTCA ACGCAAGCAA GCATTTCAGT 5640

ATCCATAACA GAAAATTGTG ACATCATTTT TGTATCCATG ACAAATACTC CTTTTTTAAA 5700 

AAACTAAAAT AAATCAGAAT AGAATCCTCA TAATTTTACT ATAAGTCTTA CCAACTTAGT 5760 

CCCAATTTAT CACCAACCAT ACCTCCTAAG CATGTTAATC CACCCCCAAT TGCACCAATG 5820 

TGTGCTCCAA CAAATGCACC AGCAAGTCCA GCTACTCCTA AAGTGGCCAA ACCTGCTCCA 5880 

GTTCCACCAG TTATAATTCC CGTAGTGACT CCTGTAATCA GTGCATTTTG ACAATCAGTG 5940 

GAGCTATACC CCCCTTCAAC TTTCGCAAGC ATTTCAGTAT CCATAACCTC TAACTGTGAC 6000 

AACATTTTTG TATTCATGAT GAATACCTCC TTTTTATTTT CAATTTGTTA CCAAAGTCTT 6060 

AAATTCAATA AACAAATAGA TTTTTTATAG TATCTTTTTG ATTTTCTTAA AAAAGTATAT 6120 

ACGTCTACTA TCTTCTTAAA GGTAGCAGTA CCTATTTTTT AGTCTAAGAT TTCAATAATC 6180 

TTGAGTATCT AAAATATCTT AATTTCGTTA TTCTCCTTGC AATAAAAAGT TTTACTATAC 6240 

TATTTATTAA CTTGCAGAAA GCAAAAAATA TTAGTAAATA ATAGTTTATA GTTAAGTTTT 6300 

TTATTCCTAC CAATCCATCA ACTAAGTAAA GCATCAACGA TTACATAAAC GATTGATAAT 6360 

ATAATTAAAA TTTTGCTAAC TATCTTATTC TCATCATTCT TAGATAACTT TGATATTTTG 6420 

TAAGTAAGTA AATAAGACAG TAAATTAATA GCGATAATAA TACTATATTT AAGAATCATA 6480 

ATCTTACAAA GAGGACATAA TTCCTGAACC TACACAAATA AGTGTTGCTG CTCCCCCAGT 6540 

TATCGGACCA GTCGCAGCAG CTAATAGTAC TGCTCCAATA CAACCACCGA TTGCAGATCC 6600 

TAAATTGCCT CTTCCTCCAC TAACTATTTC GAGTTCTTCA TTATCCATAA CAGAAAATTG 6660 

TTCCATCATT TTTGTATTCA TGACAAATAC TCCTTTTTTC TTTTTTTATT TTTGTCTTGT 6720 

TGTAACTTTG ATAAGTTTAG TATATCATCG TTTTTTAAAA TTTTTCATCC AGATCTTGAA 6780 

TTGTCATCGA AACGTCTTGA ATTAGCTTTT TTATTTCAAG CCACCTCTAA ATGTTTAAAA 6840

AAAATAATTT CTAATCACTT TTTTACCATT CAGGAAGTTT TAATGACTAT TCAAGATTTC 6900 

ATAAAATATG AACTTAGTTT TATGACATAA TAGACCTATC CACTATATGA AAGGAATTGC 6960 

CAATGACTTC TTATAAACGT ACATTTGTTC CTCAAATAGA TGCGAGAGAC TGTGGTGTCG 7020 

CTGCCTTAGC CTCGATTGCT AAATTCTATG GTTCAGATTT TTCTCTAGCT CACTTGAGAG 7080 

AACTTGCAAA GACCAATAAA GAAGGGACGA CTGCTCTTGG CATTGTAAAA GCCGCTGATG 7140 

AAATGGGCTT TGAAACAAGA CCTGTTCAAG CAGATAAAAC GCTCTTTGAC ATGAGTGATG 7200 

TCCCCTATCC ATTTATCX5TT CACGTTAACA AAGAAGGAAA ACTCCAACAT TACTATGTTG 7260 

TCTATCAAAC AAAGAAAGAC TATCTGATTA TTGGTGATCC TGACCCTTCT GTAAAAATCA 7320 

CTAAAATGTC AAAAGAACGC TTTTTCTATG AATGGACTGG AGTAGCTATT TTTCTAGCTA 7380 
««« C^CAACCC CATAAAGATA AAAAGAATGG TO A„AAGC
™* m „ 
«CTATT»,CA ATATAGGTCG TTCTMCTAT CICCA.^., 

AATOao,^, "CCAAGGAA TCT7GOATC, ATACATTO* 

««— »», laOTTCOro 

,rr :r~ ~ ™ —» ~ 

TATTCGCCAT a^a,, 
~. CAGGAGAAAT CATVTCAOGA TTCACAOATG ^ 

™ ™* ~ -™ «— 

«m™, gaOAAACCC TAATCTCTTC „ Amcc , T ^ 

— ~, „„ 
zz: r™™ ~ — — ~ 

«~» »~ TCGCTATCAA GCGAATTTOT AGA„a„g 

«««« TTAAGCTCAG .AAA.A^ a TO «^ 
««« _ „G OTTCCCGCTC AA_ T 
™* GTCAGCTGAT ^.AO a™ c„ ~ 
— • —CCA « „ AGGTCGCTAA „ 
*~ ~ ™ CAAGITCAAG ,„ 

a™^ „ a ^ m _ t 

— — » AGGTTAGCCr AG™»G„ 

z:z ~ *™ — ™ ™ 

_ a„ _ T 

^CCA ACAAGCCTAT A^aA^ 
— . GTAATCATAT GATTAGTCAA GAAGATATTC .AAA^ ^ J 
~= AAGACATTGA „ W ,„ „„ 
GCTCGTCTAT CAGGAGGAC, GAAGCAACGA „CGC^ 

G^TATAGATA ATCTOATCTC TCTAACTGAT AAAACCAT*TC TCTTTGTAGC CCATOGTCPC 
^GTATACCCG AACGAACCAA CCGTOTCATT GWCTTGACC AGGGGAAAAT CATTGAAGPT 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7780 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTCCATTTTT TTGATTTCAT AAATAAACAA CCTCTCTGTT „™ 

l - LI CTCTGTT AATTPTGTAT AATTATAACr 

~ ~ _ A „ _ 

««— — 

™ ~ ™ m _: 
«-« oc^. w „„ cmbmw ^ 

«-«. GcmT „„ 

~* ™**" c °~ — — <~ ™JZ 

™r ~ — ~ ~ — ^ 

hmaMrQ stoct ^ 

= 1TACPTCTGC « „ „ 

— , Tm „ ™ 

„„^ ccomcOT «~ 

*™ — — ™ «_ jz: 

«-» «~ *™ ««« e_ ^ 

°™ ™ — ~ ~ 

—n. <~«,C«C 0 ACAOOCTGCT 
~C ATAGCTSAAT 
— « 

~ ™ ™ — — ~ 

r^" iac 

~c S ™ ra cc _ c 4oai J2 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
TGCGGTTGAG AGACTTGAGG AGGGTTGACT TCCCTGATCC AGATGGACCA ATCAAGGCTG 1500

TAATTTCCTT AGGTTGGAAA GATAGGGAAA CACTATTCAA AGCCTTCTTT TTATTATAAT 1560

AAACGGACAG GTCTGATACC TGTAAAATCG CATCTGTCAT ACGGTTTCCT TTCTAACCAA 1620

AGTGACCAGA TACATAGTCA TTGGTGGACT GTAGCTTGGC ATTTTGGAAA ATAGTTGCAG 1680

TCTTGTCATA CTCAATCAAA TCACCCAAGT AAAAGAAGCC TGTATAGTCA CTTGCACGAG 1740

CAGCCTGCTG CATATTATGC GTTACAATGA TGATGGTAAA GTTTTTCTTG AGCTCAAACA 1800

TGGTCTCTTC TAGTTGCATG GTCGCAATCG GATCCAAGGC TGAGGCTGGC TCATCCATTA 1860

AGAGGATATC TGGCTTAACA GAGATGGCAC GAGCGATACA GAGACGTTGT TGCTGACCAC 1920

CTGATAAGGT CAAGGCTGAC TTGTGGAGAT CGTCTTTAAC CTGATCCCAG AGGGCAGCCT 1980

GACGAAGGGA GGTTTCTACG ATTTCATCTA GGACTTGCTT ATCCTTAACT CCAGCACGTT 2040

CATGCGCAAA GGTAATATTA CGGTAAATTG ACTTAGCAAA TGGATTGGGA CGTTGAAAAA 2100

CCATTCCAAT GTGTTTACGC ATTTCATAAA CGTTGATTTC TGGACGGTTG ACATCAATTC 2160

CACGATAGAG AATCTGCCCA GTTACTTTAG CAATATCAAT AGTATCATTC ATGCGATTGA 2220

GACTGCGTAA GTAGGTAGAT TTCCCCGATC CCGACGGGCC AATCAAAGCT GTAATTTTAT 2280

TTCTTTCAAA TTGCATATCA ATCCCCTTAA TGGATTCATT TTTACCATAG TAAACATGGA 2340

CATCCTTAGT AGAAAGGGCT ACTTTTTCTT CAGGAAAGGT AAGGATATGC TTCTCATCCC 2400

AGTTATATGT TGACATGGCT TCTCCTTTAG GCAGCGGTTA ATTTCTTGTG TAGATAGCTT 2460

CCGAACTTAC GAGCTCCAAA GTTAAAAATC AGGATAAAGA TCAGGAGCAC AGCGGCAGAA 2520

CCTGCTGATA CAATGGTTCC ATCTGGAATA GTGCCTTCAC TATTGACTTT CCAGATATGG 2580

ACAGCCAAGG TTTCTGCTTG ACGGAAGATA GAGATGGGGC TAGTCACACT GAGGATATTC 2640

CAGTTAGACC AGTCAAGAGC TGGCGCCGAT TGCCCTGCTG TATAGATCAG AGCTGCAGCT 2700

TCGCCAAAGA TACGACCAGA TGCCAAGACG ACACCCGTTA CAATACCTGG AAGCGCTTCC 2760

GGAATAACAA CATGAACCAC TGTCTCCCAG CGAGAAATCC CAAGAGCCAG ACCAGCCTCA 2820

CGTTGGGTAT GGTGAACGTG TTTCAAACTA TCCTCTACAT TACGCGTCAT CTGAGGCAAG 2880

TTAAAGACTG TCAAGGCCAA GGCACCTGAA ATGATTGAAA ATCCATACTC AAACTGGACT 2940

ACAAAGATCA AGTAACCAAA GAGACCCACC ACCACTGATG GTAAAGAGGA CAAAATTTCA 3000

ATACAAGTCC GCACAAAGTT GGTAACAGGA CCTTTTTTAG CATATTCAGC CAAGTAAATC 3060

CCAGCTCCCA TAGAAAGAGG TACAGAAATA ATCAAGGTAA TGACCAATAG GAAAAAGGAA 3120

TTGTAAAGCT GAATGCCAAT CCCACCACCT GCTTGAAAAG CAGAAGACCT TCCAGTCAAG 3180

AAAGACCAAG AGATATGGGG CAAGCCCCGA ACCAAGATAT AGAGAATCAA GGAAGCCAAG 3240

ATTGTCACAA TGATGCTAGC AATCGTATAG AGGACAGCTG TTGCAAGTTT ATCTAATTTC 3300

TTAGCGCGCA TAATTTTTCT TTCCTCTTTC TTTCGTAATC AATTTAATCA CACTGTTAAA 3360

AACTAAGCTC ATCAAGAGCA GTACCAAGGC CAGTGACCAG AGAACATTAT TATTTACAGT 3420

TCCCATGACA GTGTTCCCAA TTCCCATAGT TAATATAGAA GTTAAAGTTG CAGCTGGTGT 3480

GGTCAAGGAA GTTGGGATAA CAGCTGAGTT TCCGACAACC ATCTGGATAG CTAGAGCCTC 3540

ACCAAAGGCA CGCGCCATCC CAAAGACCAC TGCAGTGAAA ATACCAGAAC GGGCCGCCTT 3600

CAAGATCACA CGCCAGATAG TCTGCCAGCG AGTGGCTCCC ATAGCGAAAC TGGCTTCACG 3660

ATAATAACGA GGAACCGCAC GCAAGCTATC CGTTGTCATA AAGGTTACGG TCGGCAAAAT 3720

CATGACAAAG AGGACGGAAA TCCCTGACAA AATCCCAAAA CCAGTCCCAC CAAAGACACT 3780

GCGAACAAAG GGAACGACGA CTTGCAAGCC AATAAATCCG TACACTACTG AAGGAATCCC 3840

AACCAGGAGT TCAATAGCTG GTTGCAAAAT CTTCGCCCCT TTTGGTGATA CTTCGGTCAT 3900

AAAAACTGCT GCACCAATAG CAAAGGGTGT TGCGATAAGG GCTGAGAGAA TGGTAACGAT 3960

AAAGGAACCC AAAATCATAG GAAGGGCACC AAATTCTTTA CTAGAAGGAT TCCAAGTTCC 4020

TCCCAAAAGA AAGTCAAAGA TATTCACACC ATTGACAAAG AAGGTCGACA AGCCTTTTTG 4080

CGCTACGAAA ACCAAAATCA TGGCCACAAG GATGACTATC AAAGAAAGAC AGGCAAAGGT 4140

CAAACCTTTT CCTAATTTCT CCAGACGAGA ATTCTTTGAT GGAAGCAACA TTTTCTTAGC 4200

TAATTCTTCT TGATTCATTA TTGTCTCCCT TCCAACACTG TCACAGTTCC GGCAGCATCT 4260

TTTTCAACCT TCATTTCCTT AATCGGAATA TACTTCAATC CTTTGACAAT CCCTTCTTGG 4320

GTCTCATCCG AGAGAACAAA ATTGAGAAAT TCTGCAGCCA ACTCATTGGG CTGCCCCAAT 4380

GTATACATAT GCTCATAAGA CCACAAGGGC CAATTATTGC TACTTATATT TTCTGGACTT 4440

AAGTCATAGC CATTCAACTT CATGCTTTTG ACCGAATCAT CTATATAGGT AAGAGATAAA 4500

TAAGAGATAG CTCCTGGACT TTTTGATACG ATTGATTTTA CCGCTCCATT TGAATCCTGC 4560

TCCTGACTTT GCATGGCAGA CTGACCTTCC ATAATGACAG TATCAAAGGT AGCACGAGAG 4620

CCAGAGCCGG CTGCCCGATT GATAACAGAG ATGGGTAAGT CCTTACCACC AACCTCTTTC 4680

CAATTGGTTA CCTCACCTAT GAAGATTTGA CGAAGTTGCT CTGTCGTTAG GTTATCAACA 4740

TCAACCTCCT TATTGACAAT CAGAGCCAAG CCAGCTACCG CGACCTTGTG GTCAACAAGA 4800

GCAGAAGCAT CAATTCCGTC TTTTTCCTCA GCAAATACAT CTGAGTTTCC TATATCAACT 4860

GCCCCAGACT GAACCTGGGA CAAGCCTGTA CCAGAACCTC CCCCTTGGAC ATTGACCGTT 4920

TTTCCAACAT GGATCGTGCC AAATTCATCT GCCGCTACTT CAACCAAGGG TTGCAAGGCA 4980
GTTGAGCCAA CAGCCGTTAT GGATTCTCCA CGATCAATCC AGCTAGCACA GCCTACTAAA
CAAGCCGTCA GCCAAAAAGC GATAAGAGAC AGAGCAAGCT TTTTTCTTTT TTTCACTGTT 
TTTCTCCTCG AAAATAATTA TGAATACTGT GAATTTTTTA AGTAGTTCTT TATGAGTTGA 
CGCATGAATT CTTACCAAAT TTCTGCGCAA TTGATTATTT ATATAATATA GGCTATATTA 
CTCTTTCCTA ACCTCCTTTT TTCATATGTG GATAAAATCT CTTGTCTATC CCTTCCCCCA 
TTGTCACCCA TTATAGTCAT TTCGTGTCTC TTTTTCCCCT TTTTAATGCA AGGGAAATTA 
CTCTCCTTAG ATGATAATCC AAAAGCTAGA AAGGTATCTC AAACCTCTCT ACTCTCCCAG 
ACTAGTTTAC AACTAAAAGG AAAAGATTCT ATTTTATGAG AAATCTAGTT TACAAGCGGT 
AAGAACGCTA ATAACTAAAC TTCTTGTACT CTTTGAAAAT CTCTTCAAAC CAGTGTTTTG 
AGCTATCTAT GGCTAGCTTC CTAGTTTGCT CTTTGATTTT CATTGAGTAG TAAAACTACA 
TGTAATGGCA ATCAAGATAT CAAGAATCAT CCTACTAAAA AAATCCATAC TTTCACTATA 
ACATAGAATA AGATATTTGA CTAGCATTTT CATTTGAATC TGAGGCCTTT TGGAAAATAA 
TTTTTCAAAA CATTTCCAGT AACCTTTGCA AAGCCCAAGC CATTGCCTTT AACCAAAACT 
TGGTACCAAC CATTTGGCAG ACTTTCTGCC AGCTGAACGG TTTCTCCAGC CGCATACTTG 
ACAAACGCTT CTTGGCCAAT TTCAACCGAC TGTTCGACCT GACTCGGTTT CAAGGCTAAA 
CCAAGAGCGA AACTGGGCTC AAAGCGTTTC TTCTTAAAAG TACCCAGATG CAGTCCATTG 
CGAGCAATCT TGAGCTTCCA TAAATCTGGC AAAAGTTCTG GCAAGAGATA AAGCTGGTCT 
CCAAAAATCT GCAAGATACC CGGTAGATTG ACCTTCAAAT GGTTTTGGGC AAATTCCTGC 
CACAAGGCAA CTTGTTCACG GCTGAGGTTA CTCTTACTTG CCTTAAATTT AGGAGCTGGA 
TTGTTACCCT TAAACTGTAG ATGGGCAACA AACTGACCCT CTCCCTTAAA CTGATGAGGA 
TACATCCGAG CCGTTTCTGG CAGGTCAATA CCAGCTACCA TTCCATTGAT ATGCTCTACT 
GGCAACAAGT CAAAATCATA CTCTTCCAGC AACCAATTGA CAATCTCTTC GTTTTCCTCG 
GGTGCCCAGG TACAGGTCGA ATAAACCAGA TGACCACCTT CAGCTAACAT GGTCACTGCA 
TCCTCCAGAA TTTCTCTTTG CAAGCTAGCA CATTGACTCG GATAATCTAA GCTCCAATAG 
TCCATAGCAT CAGGTTGCTT ACGAAACATT CCTTCACCAG AGCAAGGGGC ATCAAGAACG 
ATTAAGTCAA AATAGCCTTT AAAGACCTTG ACCAAGCGGT CGGCAGATTC ATTGGTCACC
ACGACATTTG TCGCTCCAAA ACGCTCCATG TTTTCAACCA AAATCTTAGC CCGTTTGCTT 
GAAATTTCAT TGGAAnCAAG TAGCCCCTCC CCTGCTAGAT AGGCTGCCAG TTGAGTTGAT 
TTGCCCCCCG GTGCAGCAGC CAAGTCCAAG ACCTTCATAC CAGGACTGGG TTGGGCTACT 
TGAGCCACCA TTTGAGCAGC AGGTTCTTGC GAATAAACTA AACCTGTAGC ATGCTCAGGC 6780

GATTTCCCTG AAACCTTCCC ATAGTGGCCC CAAGGGGTTT GAGTAATGGC ATCAGAAAAG 6840 

GAAAGTTGCT CTTCTTTTAA GGGATTGACC CGAAAGGCCG AAACCGCTTC CTCCTCAAAA 6900 

GAGGCAAGAA AATCTCTTGC CTCATCTCCT AGTATCTCTT TATATTTTTC AACAAATCCT 6960 

TCTGGAAATT GCATTTAAGT TCTTTTCCTT TCGTAAATAT AGGACTGAAT TTCCTCCTGC 7020 

ATCTCAAGAG GCACCATCAT GACCGGCTGT CTGGTTTGAA AATCAGGAGC TTCACCAAAA 7080 

AGGGTCACAA CCCGATAGCC CAGACTTTCC CCTAAAATAC TAGCTGCGGC ATAATCCCAT 7140 

GGTTGCAGAT AAGTGAGATA GGTCAACAAA CGCCCTGACA AAATCTTGGC AAAACTAATG 7200 

GCCGCACTTC CATAGACACG AACACCAAGA ACCGCTCGGC TCAAATCAGC CAGCCCCCAT 7260

TCATTQGTTT CCAGCATACC ACTATTCCCT GCAATGAGAA AATCTCCAAG TGGTTTAGTT 7320 

TTAAAAGGAG CTAGGGACCT ATCATTTAGA CAAACTGGAA ATTCCCCACC ACCGTGGTAA 7380 

CAATCCCCTT TGACCACATC ATAAATCAGA CCAAACTGTC CCTGACCATT TTCAAAATAA 7440 

GCCATCATAA CAGCAAAATC TTCCTGCTGG GCTACAAAAT TATTGGTACC ATCAATGGGA 7500 

TCAATGACCC AAACCTTGCC CTCTTGAACC GAGGCTCGCA GACAACCTTC TTCAGCACAA 7560 

ATCTTATCCT CAGGATAACG GGACAAAATC TCACCAACCA AGAGTTCCTG AACTTCTTTG 7620 

TCCAGTCTGG TCACCAAATC TGTTGGAGAG GACTTGG7TT CAACACGCAA GTCTTCCTGC 7680 

ATATGGTCAA GAATGTACTG ACCTGCTTTC TTAACAAGCT CTTTAGCAAA TTCAAATTTA 7740 

CTTTCCAAGA GAAATCTTTC CTTCCCCTTT TTCTTTGGGG 7780 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4820 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTAATGATAT AGGAACACCA GGTGACCTGA TGGGACGTCG TAAGCCTATG AACTACTAGC 60 

TGCTAAAGGC TTTAAAGATG GTATGGTACC ATATATCTCA AACCAATACG AAGAAGAAGC 120 

CAAACAAAAG GGCAAGACAA TCAATCTCTA CGGTAAAACA AGAGGTTTGG TTACAGATGA 180 

CTTGGTTTTG GAAAAGGTAT TTAATAACCA ATATCATACT TGGAGTGAGT TTAAGAAAGC 240 

TATGTATCAA GAACGACAAG ATCAGTTTGA TAGATTGAAC AAAGTTACTT TTAATGATAC 300 

AACACAGCCT TGGCAAACAT TTGCCAAGAA AACTACAAGC AGTGTAGATG AATTACAGAA 360 
ATTAATGGAC GTTGCTGTTC GTAAGGATGC AGAACACAAT TACTACCATT GGAATAACTA 420

CAATCCAGAC ATAGATAGTG AAGTCCACAA GCTCAAGAGA GCAATCTTTA AAGCCTATCT 480 

TGACCAAACA AATGATTTTA GAAGTTCAAT TTTTGAGAAT AAAAAATAGT GTCTACTATT 540 

AGGAAATAAA GTTTAAAAAG GTGATGAAGA ACAAACCAAG ATTCAAGCAG GAATTCCTAC 600 

TGATAATGAA GTAAGTTATG ATCTTATTTA TCAGCAGGAA ACTCTTCCTG CAACAGGTTC 660 

ATCAACTTCT GAGCTTACAG CTTTAGGCCT ATTAGCTGTT GGTAGTTTAG TTCTTTTGGT 720 

TCATAATATG ACGGGAACAG TTTTTTGCTC CCTCTGAAAA GTCATCATTT GATGGCTTTT 780 

TTCTATATAG GGTAAAAGAT AGGGTAAAAG GCTATCATCG GACAAAATAA AGAAGGCATG 840 

ATATAATATA AAGTAGATTT CTATGTCATA AAACAAGAAC TGTTTGGACA TCATTCATTT 900 

GAAAACTCTC TATGTTCAAA CAATAGTAAA ATAAAATAGG GGATCTAAAT CCTTGCTATG 960 

AAAGGAAAAA ACTCAATGGC TACTATTCAA TGGTTTCCTG GTCACATGTC TAAAGCTCGT 1020 

CGACAGGTGC AGGAGAATTT AAAATTTGTT GATTTTGTGA CGATTTTAGT AGATGCACGC 1080 

TTGCCTCTAT CTAGTCAAAA TCCTATGTTG ACCAAGATTG TTGGTGATAA ACCAAAACTC 1140 

TTGATTTTAA ACAAGGCCGA CTTGGCTGAT CCAGCAATGA CCAAGGAATG GCGTCAGTAT 1200 

TTTGAATCAC AAGGAATCCA GACGCTAGCT ATCAACTCCA AAGAGCAAGT GACTGTAAAA 1260 

GTTGTAACAG ATGCGGCCAA GAAGCTCATG GCTGATAAGA TTGCTCGCCA GAAAGAACGT 1320 

GGGATTCAGA TTGAAACCTT GCGTACTATG ATTATCGGGA TTCCAAACGC TGGTAAATCA 1380 

ACTCTGATGA ACCGTTTGGC TGGTAAAAAG ATTGCTGTTG TTGGAAACAA GCCAGGGGTC 1440 

ACAAAAGGTC AACAATGGCT TAAAACCAAT AAAGACCTGG AAATCTTGGA TACACCGGGG 1500 

ATTCTCTGGC CTAAGTTTGA GGATGAAACT GTTGCACTTA AGTTGGCATT GACTGGAGCT 1560 

ATCAAAGACC AGTTGCTTCC TATGGATGAG GTTACCATTT TTGGTATCAA TTATTTCAAA 1620 

GAACATTATC CAGAAAAGCT GGCTGAACGC TTCAAACAAA TGAAAATTGA AGAAGAAGCG 1680 

CCTGTGATTA TTATGGATAT GACCCGCGCC CTCGGTTTCC GTGATGACTA TGACCGTTTT 1740 

TACAGTCTCT TCGTGAAGGA AGTCCGTGAT GGCAAACTCG GTAACTATAC CTTAGATACA 1800 

TTGGAAGACC TCGATGGCAA CGATTAAAGA AATCAAAGAA TTCCTTGTGA CAGTCAAGGA 1860

GTTAGAAAGC CCTATTTTTT TAGAGCTTGA AAAGGATAAT CGCTCAGGAG TTCAAAAGGA 1920

AATCAGCAAG CGTAAAAGAG CCATTCAAGC TGAATTAGAT GAAAATTTGC GCTTGGAATC 1980 

CATGCTTTCT TATGAAAAAG AACTTTATAA GCAAGGATTG ACCTTAATTG CAGGTATTGA 2040 

TGAGGTTGGT CGTGGTCCTC TTGCTGGTCC TGTAGTCGCT GCGGCCGTTA TTTTATCTAA 2100 
AAATTGTAAG ATTAAAGGTC TCAACGACAG CAAGAAAATT CCTAAAAAGA AACATCTGGA 2160

GATTTTCCAA GCCGTTCAAG ACCAAGCCTT GTCGATTGGA ATTGGTATCA TAGATAATCA 2220 

GGTCATCGAC CAAGTCAACA TCTATGAAGC AACCAAACTA GCCATGCAAG AAGCAATCTC 2280 

CCAGCTCAGC CCTCAACCAG AGCACCTTTT GATTGATGCC ATGAAACTGG ACTTGCCCAT 2340 

TTCACAAACC TCCATTATCA AAGGAGATGC CAACTCCCTC TCTATCGCAG CAGCATCTAT 2400 

AGTAGCCAAG GTAACACGTG ATGAATTGCT GAAAGAATAC GATCAGCAGT TCCCTGGCTA 2460 

TGATTTCGCT ACTAATGCAG GATATGGCAC AGCTAAACAT CTGGAAGGCC TCACAAAACT 2520 

AGGAGTTACC CCAATTCACC GAACCAGCTT TGAACCCGTT AAATCACTGG TTTTAGGTAA 2580 

AAAAGAAAGT TAATTGAAAG GAAATAACAT GGAGGAACAG TCGGAAATAG TCCGTTCTAA 2640 

GAAAGAATTC GCCTTTGCAT CCAGCACTAT ACTATCCCAA GTTGGTCGAG GAATCATTGT 2700 

CGGCCTCATC GTTGGAATTA TCGTCGGATC CTTTCGTTTC TTAATTGAAA AGGGCTTCCA 2760 

CCTGATACAA GGAGTTTATC AAGATCAAGG GTACTTAGTG CGCAATCTTT TTGTACTGGT 2820 

TTTGTTTTAT ATACTCATCT GTTGGCTCAG TGCCAAACTA ACACGGTCAG AAAAAGATAT 2880 

TAAAGGCTCA GGAATTCCTC AAGTCGAAGC CGAACTGAAA GGCCTCATGT CCCTCAACTG 2940

GTGGGGCATT CTTTGGAAAA AATATGTGCT AGGTATTCTT GCTATTGCCA GTGGACTCAT 3000 

GCTGGGTCGA GAGGGACCCA GCATTCAACT TGGAGCAGTT GGTGGTAAAG GAATTGCCAA 3060 

GTGGCTCAAA TCCAGTCCAG TAGAGGAACG TTCCTTGATT GCCAGTGGAG CTGCAGCAGG 3120 

TTTAGCCGCA GCCTTTAATG CTCCTATTGC AGCACTTCTC TTTGTTGTAG AAGAAGTCTA 3180 

TCACCATTTT TCGCGCTTTT TCTGGGTCTC AACTCTAGCA GCCAGCATCG TAGCAAACTT 3240 

TGTGTCTCTA CTCATGTTCG GTTTGACACC AGTATTGGAT ATGCCAGATA ACATTCCTCC 3300 

CATGACCCTA GATCAGTATT GGATATATCT CGTCATGGGA ATTTTCCTTG GATTTTCAGG 3360 

TTTTCTCTAT GAGAAAGCTG TATTAAACGT TGGAAGAGTT TATGACTTGA TTGGTCAAAA 3420 

AATCCATTTG GATAGGGCTT ATTATCCCAT CTTGGCTTTT ATCCTTATCA TACCAGTCGG 3480 

AATCTTCTTA CCTCAAATCA TTGGTGGCGG AAATCAGCTT GTCCTTTCTT TAACTGAACA 3540 

AAATTTTAGT TTCCAAGTTT TATTAGCTTA CTTTTTAATC CGCTTTATTT GGAGTATGAT 3600 

TAGCTATGGA AGTGGACTGC CAGGAGGAAT TTTCCTCCCC ATTTTAGCTC TTGGTTCTTT 3660 

GCTTGGTGCC TTAGTTGGTG TTATCTGTGT CAATCTTGGA CTTGTCAGTC AAGAGCAATT 3720 

CCCTATATTT GTCATTCTAG GAATGAGTGG CTATTTTGGA GCCATATCAA AAGCTCCCTT 3780 

AACCGCTATG ATCCTCGTAA CTGAGATGGT AGGAGATATT CGCAACCTTA TGCCACTTGG 3840 

TCTTGTCACT CTTGTTTCTT ATATTATCAT GGATTTGCTC AAAGGTACGC CAGTCTATGA 3900 
AGCCATGCTG GAAAAAATGC TTCCAGAAGA AGTATCTAGC GAAGGAGAAG TTACACTTAT 3960

CGAAATACCA GTTTCTGATA AAATTGCTGG GAAACAAGTT CATGAACTCA ACTTACCACA 4020 

CAACGTCCTC ATCACAACTC AAGTCCATAA TGGCAAGAGC CAAACAGTTA ACGGCTCAAC 4080 

CAGAATGTAT CTGGGTGATA TGATTCACCT GGTTATTCCA AAAAGTGAAA TTGGAAAAGT 4140 

CAAAGATTTG TTGTTGTAGT ATGAGTATTT ACATAATTTA TGTTATGTAA ATGATCAGTT 4200 

TGATTTATTT AGAAAACCGA TTCTCAGGAA TGAGATCGGT TATTTTTTAC TGATGAGGAA 4260 

TTTTACATAT AAATAATTGA ACTTTATTAA AAATAAGACT ATAATTAAGT TAGAAATGAT 4320 

AAAGTATAAA GCTAGAAAGG AGTTTACTGT ATCAAATCTG TACAGTAAGA TTAAAATCAT 4380 

GAAAAAGAAA ACAATAGCAA TTATATAGAG AAATGAAATA GAAATAGGAT AAAACAATCA 4440 

GGACAATCAA ATCAATTTCT AGCAATGTTT TAGAAGTCCA GATGTACTAT TCTAGTTTCA 4500 

ATCTATTATA CAATGTGTTT TGTATCTCAT AGCTCCTTAT ATAGCTCTTC AGTTATGTAG 4560 

TATTAACAGA AGTTTAGTGG GTGAGATTTT TATTATTTTC CTTATTCTGT TTTGTTTGTA 4620 

GGTCTAAGTC TTTTTATCAC TTTGAAAAAC TCCTATAACA TCTTTCCGAA AAACTATAAT 4680 

TTTCTTGAAA AATATACAAG TCTATGCTAT ACTACTAGTA TACTTACTTA TGGAGAAAAT 4740 

ACATGAAACG TGAGATTTTA CTGGAACGAA TCGACAAACT AAAACAACTC ATGCCCTGGT 4800 

AAGTTCTGGA ATACTACCAA 4820 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CTACGACATC ATGATTAACA GTCATGCGCT ACTACCAACT GAGCTATGGC GGATAAAATA 60 

GTCCGTACGG GATTCGAACC CGTGTTACCG CCGTGAAAAG GCGGTGTCTT AACCCCTTGA 120 

CCAACGGACC TTCTATCTGT AGCAGATATA ACCATTATAT CAATTTCTTG CTAATTGTCA 180 

ATCACTTTTG AGATTTTTTC TCTAAAATAT CTTTTAATTT TCTAATTTTT AATCTTGAAA 240

TAGGACAACG ATGGTCTTCA TAGAAAACAA TTTCTAAGTT TTTTCGATCA ATTTCTCTGA 300 

TATTACCTAT ATTTACCAAA AATGACTTGT GAGGAGAATA AAATCGCTGA GTATGTTTGT 360 

CCTTTTCCTG AATATCTGTC ATGGTACCAT AAAACTCTTT TGCAAAATTC TTACCAATAA 420 
TGCGCAATTT ATGAGATACC CCTGTTGTTT CAATATACAA AATATCATGG TAAGGAATTT 480

TTAAATCATT TCCCTTGTAA TTGTAGTCGA AATAATCTAC AACATCTTCA TTTTCAAGTA 540 

ACATACTCTT CGTGTAGAAG ATATTTTGCT CAATTCTCTT CTTAAACATC TCATCATTGA 600 

TATCCTTATC AACAAAATCT AGGGCTGATA CCTGGTATTT ATAGGTTAGA GTCGCAAACT 660 

CTGATCGACT AGTGATAAAG ACGATAATAG CGTAAGGATT GTAATGACGA ATGAGCTGAG 720 

CCACTTCAAA TCCCTTTTTC TCAATTCCAT GAATATCGAT ATCTAGGAAA TAAAGCTGAT 780 

TTACTTCATC ATTTTCAATG TATTCTTCAA ATTCACGGAC TTTTCCCGTT GTCTTGTATG 840 

ATATTGGAAT ATTCGATTCT TTCGAAATTT CATCCAATAT TCTCTCTAGT CTCACTTGAT 900 

GTTCAATAAC ATCTTCTAAA ATTAAAACTT TCATTCAAAT TCCCTCTTAA ATCTAATGAT 960 

TTGTCTAAAT GTACTGCCTT CCATCTCTGT TTCTAAAATA ATATTGTTGT ACTTATCTAG 1020 

TAGTTCTTTC ACATTATTTA ATCCGACTCC GCGATTTCTT CCCTTAGTGG AGAATCCTAA 1080 

GGCAAATAGA TCTCCTGAAG GAGTCATCGT CATTTTACAT GAATTCTGAA TCACAATAAC 1140 

TGTTTCAGTT TCCATCTTAA TAACTGCTAC TTCCATCTGC TTTTTATAGC TATCAGCCGA 1200

TCCTTCGACA GCATTATTCA ATAAAACGCT CATGATACGA ACCAAATCCA ATAGTTCAAT 1260 

TGGAAGCTTG GTAATCGTAT CTTTTACTTC CAGTGTAAAC TCTACACCAT TATTTCGAGC 1320 

ATAGACAATT GACTGAGCAA CCAAACTTCG TAAAGCTGAG TCTTCTATGT TGTTCAAATC 1380 

AAAGTAAGTG TACTTATCTG AACGCAATTT ATGATTTGCT TTGACTAAAA CTTCATTGTA 1440 

AATTCTGTCA ATTTCCTGTA AATTACCACT GTCAATTGCC ATCTGCATGC TGACAAGCAT 1500 

TCCAGCATAA TCATGTCGAA AACCACGGAT TTCATTATAC AGACCAACAA TTTCATCTGT 1560 

GTAATTCTGT AAATGTTTCT GTTCAAATTT CTTCTGCTTC AAAGCAATCT CTTTCTCCAT 1620 

TTGAACTTTA TGAGAATTCA TTGCAAAGAA GGTCAAAAGG AGAGAGATAA AGACAATAGA 1680 

TGACAAAATA CTTCCAAAAC TATTCAAATG TTTAATCGTA CTTACCATAT CTGAAACGAA 1740 

AGATACAATA TGTAGCAATA GTAAAGCAAA AAATACTTTT TTCAAGAAAG GATAAAGGTA 1800 

GTCCTTGTCA AAATAGGCTA GTTCCAAATG GAAATAGTAA ATGATTTTTA ATGTAACAAA 1860

ATAGGTTAAC ACCGTCACAA CGAAAAAGAA TGGGAAATGA TATTGTAAAA CAAAATTATC 1920 

TCCTGTTATA GAGGAGAAAA TTACGGACAG AAAGTTATGA GTGCTCTCAT ATAAAAGAGA 1980 

TAGTAGTAAA CTTAGGAATA GTCCTCTATC CCTCTCATAC TGTTTCATCC ATCGAAAATA 2040 

GGAATATAAG CCCAAAGGAA ATAAAAATCT TTCAATCCCT ATTTTATCTA AATATAGAAG 2100 

ATAAAAGGAA AATTCAAGTA CTATTTCAGT TAGTAATGTA TAAGCACCAA AAACGTATAA 2160 

TTCTTTTCTA TTTATTCGAC CTTTACAAAT TAAACGGTAA CTGTGACTAA TAATTAAAAA 2220 
ATGAACAATA ACTGTCCCAA ATCCAAGTAA ATCCATTACT CTTTCTCCTT ATTTCATTAC 2280

TTTTTTCGTA GGAAAAGAAA ATCAAGGATG ATTCTTGAAA TCCTCATCTC CCCACCTTTA 2340

ATCTTTTGTA AGTCTTTTTC CTTCAAAGCT ACAAACTGTT CCAATTTAAC TGTGTTTTTC 2400

ATAATAAAAT CTCCTAAAAT GTTTTTTCTT GTAAGCTAAC TTACAAAAAC CATTATACAA 2460

AATGGAATTT CGTTTTAGAT AAAATTCTCT CAACTGTCAT TTTTTTCTCC CAAAGTGTAC 2520

TTTTTTAAGA AAAAAGCCGG GAAAATTCCC AGCTTTGCTA TTATATTGAT CCCAGCAGGA 2580

TTCGAACCTG CGACCGTTCG CTTAGAAGGC GAATGCTCTA TCCAGCTGAG CTATGAGACC 2640

TAATACAATT ATTCTACCAA AAATTCAATT AAAAGTCAAT TTTCTATTTA TGGTAGGGGA 2700

ATCCCTGCTG AATCGTAAAA GCGCGATAGA TTTGTTCAAC AAGAACTAGT CTCATTAACT 2760

GATGGGGTAA GGTTAGGCGA CCAAAACTGA CAGAAAGATT GGCTCTATTT TTTACAGATG 2820

ATGATAATCC TAAACTTCCC CCAATAATAA AAGTAAGAGT AGAAAATCCT TTTATAGAAG 2880

TTTCTTCTAA CTGCTTACTA AATTCTTCTG AGAAGAAAGT TTTCCCTTCA ATGGCTAACA 2940

CAATAACGAA ATCACGGTCA GCAATTTTTG ATAAAATTCT CTGACCTTCT ATTTCTAAAA 3000

TCTTTTGATT TTCTGATTCA CTGGCCTTAT CTGGTGTTTT TTCATCTGAT AACTCAATCA 3060

TTTCAAACTT AGCAAATCTA GAAATTCGTT TTGAATACTC TGCGATACCA TCTTTTAAAT 3120

ACTTTTCTTT CAGTTTCCCA ACTGTTACAA CTTTAATTTT CATGACTCTA TTCTAACATA 3180

TTCTCTATTT TTTCACATCT TATTCACAAA ATAAAAAATA GATTTCAATT AAGAAAATCA 3240

CAATTTCAAA AGAGTTATCC ACAGTTTGTG TAAAACTTTT GTGTTTAAGT TATAATTAAG 3300

CTAGTCAGTT TATACTTTCA GTAATTCAAA CATATGGAGG CAAATATGAA ACATCTAAAA 3360

ACATTTTACA AAAAATGGTT TCAATTATTA GTCGTTATCG TCATTAGCTT TTTTAGTGGA 3420

GCCTTGGGTA GTTTTTCAAT AACTCAACTA ACTCAAAAAA GTAGTGTAAA CAACTCTAAC 3480

AACAATAGTA CTATTACACA AACTGCCTAT AAGAACGAAA ATTCAACAAC ACAGGCTGTT 3540

AACAAAGTAA AAGATGCTGT TGTTTCTGTT ATTACTTATT CGGCAAACAG ACAAAATAGC 3600

GTATTTGGCA ATGATGATAC TGACACAGAT TCTCAGCGAA TCTCTAGTGA AGGATCTGGA 3660

GTTATTTATA AAAAGAATGA TAAAGAAGCT TACATCGTCA CCAACAATCA CGTTATTAAT 3720

GGCGCCAGCA AAGTAGATAT TCGATTGTCA GATGGGACTA AAGTACCTGG AGAAATTGTC 3780

GGAGCTGACA CTTTCTCTGA TATTGCTGTC GTCAAAATCT CTTCAGAAAA AGTGACAACA 3840

GTAGCTGAGT TTGGTGATTC TAGTAAGTTA ACTGTAGGAG AAACTGCTAT TGCCATCGGT 3900

AGCCCGTTAG GTTCTGAATA TGCAAATACT GTCACTCAAG GTATCGTATC CAGTCTCAAT 3960

AGAAATGTAT CCTTAAAATC GGAAGATGGA CAAGCTATTT CTACAAAAGC CATCCAAACT 4020

GATACTGCTA TTAACCCAGG TAACTCTGGC GGCCCACTGA TCAATATTCA AGGGCAGGTT 4080

ATCGGAATTA CCTCAAGTAA AATTGCTACA AATGGAGGAA CATCTGTAGA AGGTCTTGGT 4140

TTCGCAATTC CTGCAAATGA TGCTATCAAT ATTATTGAAC AGTTAGAAAA AAACGGAAAA 4200

GTGACGCGTC CAGCTTTGGG AATCCAGATG GTTAATTTAT CTAATGTGAG TACAAGCGAC 4260

ATCAGAAGAC TCAATATTCC AAGTAATGTT ACATCTGGTG TAATTGTTCG TTCGGTACAA 4320

AGTAATATGC CTGCCAATGG TCACCTTGAA AAATACGATG TAATTACAAA AGTAGATGAC 4380

AAAGAGATTG CTTCATCAAC AGACTTACAA AGTGCTCTTT ACAACCATTC TATCGGAGAC 4440

ACCATTAAGA TAACCTACTA TCGTAACGGG AAAGAAGAAA CTACCTCTAT CAAACTTAAC 4500

AAGAGTTCAG GTGATTTAGA ATCTTAATTG ACATCTATGT AAAGAAAGCT TTACATAAGA 4560

GAAAAGATGT GTTAGTGTAG AATCATGGAA AAATTTGAAA TGATTTCTAT CACAGATATA 4620

CAAAAAAATC CCTATCAACC CCGAAAAGAA TTTGATAGAG AAAAACTAGA TGAACTAGCA 4680

CAGTCTATCA AAGAAAATGG GGTCATTCAA CCGATTATTG TTCGTCAATC TCCTGTTATT 4740

GGTTATGAAA TCCTTGCAGG AGAGAGACGC TATCGGGCTT CACTTTTAGC TGGTCTACGG 4800

TCTATCCCAG CTGTTGTTAA ACAGATTTCA GACCAAGAGA TGATGGTCCA GTCCATTATT 4860

GAAAATTTAC AGAGAGAAAA TTTAAACCCA ATAGAAGAAG CACGCGCCTA TGAATCTCTC 4920

GTAGAGAAAG GATTCACCCA TGCTGAAATT GCAGATAAGA TGGGCAAGTC TCGTCCATAT 4980

ATCAGCAACT CCATTCGTTT ACTTTCCTTG CCAGAACAGA TTCTTTCAGA AGTAGAAAAT 5040

GGCAAACTAT CACAAGCCCA TGCGCGTTCC CTAGTTGGGT TAAATAAGGA ACAACAAGAC 5100

TATTTCTTTC AACGGATTAT AGAAGAAGAT ATTTCTGTAA GGAAATTAGA AGCTCTTCTG 5160

ACAGAGAAAA AACAAAAGAA ACAGCAAAAA ACTAATCATT TCATACAAAA TGAAGAAAAA 5220

CAGTTAAGAA AACTACTCGG ATTAGATGTA GAAATTAAAC TATCTAAAAA AGACAGTGGA 5280

AAAATCATTA TTTCTTTTTC AAATCAAGAA GAATATAGTA GAATTATCAA CAGCCTGAAA 5340

TAAGGCTGTT CTTTTATTTT TTTATCTCAC AAGGTTATCC ACTATGTTTT TCGATAAAAA 5400

GCTTAATAAA TCAATAATTT CTTCTTTTAT CCCCAACCTG TGGATAAAGT TTGGTAACAT 5460

TGTGGATTAT TTTTCACAGC TTGTGGAAAA TTCTTGCTAT CTATGGTAAA ATATCTCTAG 5520

TATTAAACTT TTAAATAGTA AAGGAGGAGA AAGGATTGAA AGAAAAACAA TTTTGGAATC 5580

GTATATTAGA ATTTGCACAA GAAAGACTGA CTCGATCCAT GTATGATTTC TATGCTATTC 5640

AAGCTGAACT CATCAAGGTA GAGGAAAATG TTGCCACTAT ATTTCTACCT CGCTCTGAAA 5700

TGGAAATGGT CTGGGAAAAA CAACTAAAAG ATATTATTGT AGTAGCTGGT TTTGAAATTT 5760

ATGACGCTGA AATAACTCCC CACTATATTT TCACCAAACC TCAAGATACG ACTAGCTCAC 5820

AAGTTGAAGA AGCTACAAAT TTAACTCTTT ATAACTATAG TCCAAAGTTA GTATCTATTC 5880

CTTATTCAGA TACGGGATTA AAAGAAAAGT ATACCTTTGA TAACTTTATT CAAGGGGATG 5940

GAAATGTTTG GGCTGTATCA GCCGCTTTAG CTGTCTCTGA AGATTTGGCT CTGACCTATA 6000

ACCCTCTTTT TATCTATGGA GGACCAGGCC TTGGTAAGAC TCACTTATTA AACGCTATTG 6060

GAAATGAAAT TCTAAAAAAT ATTCCTAATG CGCGTGTTAA ATATATCCCT GCCGAAAGCT 6120

TTATTAATGA CTTTCTTGAT CACCTAAGAC TTGGGGAAAT GGAAAAGTTT AAAAAGACCT 6180

ATCGTAGTCT TGATCTTTTG TTAATCGATG ATATCCAGTC ACTCAGCGGA AAAAAAGTCG 6240

CAACTCAGGA AGAATTTTTC AATACCTTTA ACGCCCTTCA TGACAAGCAA AAACAGATTG 6300

TCCTAACGAG TGATCGTAGT CCAAAACATC TAGAAGGGCT CGAGGAGAGG CTTGTCACGC 6360

GTTTTAGTTG GGGATTGACA CAAACTATCA CCCCCCCTGA CTTTGAAACA CGTATTGCCA 6420

TTTTACAAAG TAAGACGGAA CATTTAGGCT ACAATTTCCA AAGTGATACT CTAGAATACC 6480

TAGCTGGGCA ATTTGATTCA AATGTTCGAG ATCTTGAGGG AGCCATCAAC GACATCACTT 6540

TAATTGCCAG AGTAAAAAAA ATCAAGGATA TCACTATTGA TATTGCTGCA GAAGCCATTA 6600

GAGCCCGCAA ACAAGATGTT AGCCAAATGC TCGTCATCCC AATTGATAAA ATCCAAACTG 6660

AAGTTGGTAA CTTTTATGGT GTTAGTATCA AAGAAATGAA GGGAAGTAGA CGCCTTCAAA 6720

ATATTGTTTT GGCCCGTCAA GTAGCCATGT ATTTATCTAG AGAACTAACA GATAATAGTC 6780

TTCCAAAAAT TGGGAAGGAA TTTGGGGGAA AAGATCATAC CACAGTCATT CATGCCCATG 6840

CCAAAATAAA ATCTTTGATT GATCAAGACG ATAATTTACG TTTAGAAATT GAATCAATCA 6900

AAAAGAAAAT CAAATAATTT GTGGATAACT TTTAGTTTTT TATCTTTTTT ATCCACATTT 6960

TTTAAACAAG CTAAAAAACT TGATATGACT TGTTTAAAGG CTGTTTTCCA CAGATTTCAC 7020

AGACTCTATT ATTACTATTA TCTTTCTAAT ACTAAAAATA AATAAAGGAG AATCCATGAT 7080

TCATTTTTCA ATTAATAAAA ATTTATTTCT ACAAGCATTA AATACTACTA AGAGAGCTAT 7140

TAGTTCTAAA AATGCCATTC CTATTTTATC AACAGTAAAA ATTGACGTGA CCAATGAAGG 7200

TATTACTTTA ATTGGTTCAA ATGGTCAAAT TTCAATTGAA AATTTTATTT CTCAAAAAAA 7260

TGAAGATGCT GGTTTGTTAA TTACTTCTTT AGGTTCGATC CTTCTTGAAG CTTCTTTCTT 7320

TATCAATGTA GTATCTAGTT TACCTGATGT AACTCTTGAT TTTAAAGAAA TTGAACAAAA 7380

TCAAATTGTT TTAACCAGTG GCAAATCAGA AATTACCCTA AAAGGAAAAG ATAGCGAACA 7440

ATATCCACGA ATCCAAGAAA TTTCAGCAAG CACTCCTTTA ATACTTGAAA CAAAATTACT 7500

CAAGAAAATT ATTAATGAAA CAGCCTTTGC TGCAAGTACA CAAGAGAGTC GTCCGATTTT 7560

AACAGGTGTC CACTTCGTAT TGAGTCAACA CAAAGAGTTA AAAACAGTTG CAACAGACTC 7620 

TCATCGCCTA AGCCAGAAAA AATTGACTCT TGAAAAAAAT AGTGATGATT TTGATGTCGT 7680 

AATTCCTAGC CGTTCTCTAC GCGAATTTTC AGCGGTATTT ACAGATGATA TCGAAACTGT 7740 

AGAGATTTTC TTTGCCAATA ACCAAATCCT CTTTAGAAGC GAAAATATTA GCTTCTATAC 7800 

TCGTCTCCTA GAAGGAAACT ATCCTGATAC AGATCGCTTG ATTCCAACAG ACTTTAACAC 7860 

TACTATTACT TTTAATGTGG TAAACTTACG CCAGTCAATG GAGCGTGCCC GTCTTTTATC 7920 

AAGTGCGACT CAAAATGGTA CTGTGAAACT TGAAATTAAG GATGGGGTTG TTAGCGCCCA 7980 

TGTTCACTCT CCAGAAGTTG GTAAAGTAAA CGAAGAAATC GATACTGATC AGGTTACTGG 8040 

TGAAGATTTG ACCATTAGTT TCAACCCAAC TTACTTGATT GATTCTCTTA AAGCTTTAAA 8100 

TAGCGAAAAG GTGACTATTA GCTTTATCTC AGCTGTTCGT CCATTTACTC TTGTGCCAGC 8160 

AGATACTGAC GAAGACTTCA TGCAGCTCAT TACACCAGTT CGTACAAATT AAGTGAAAGA 8220 

GGTTGAGCCT GGCTCGCCTC TTTTATGATA TAATCGAAAA AGAAAAGGAG AGTAGTATGT 8280 

ATCAAGTTGG AAATTTTGTT GAGATGAAAA AATCACACGC TTGTACAATC AAGTCGACTG 8340 

GTAAAAAGGC TAATCGTTGG GAAATTACAC GTGTAGGAGC AGATATCAAA ATAAAATGTA 8400 

GTAATTGTGA GCATGTTGTC ATGATGGGGC GATATGATTT TGAGCGAAAA ATGAATAAAA 8460 

TTATTGACTG AGAACCCTTA GTTAGAGGGT TAGCACTTTA TCCCTTTTTG TGTTATAATA 8520 

TTAGGGATTG AAATGAAAAC GGAGAATGAG AAATATGGCT TTGACAGCAG GTATCGTTGG 8580 

TTTGCCAAAC GTTGGTAAAT CAACACTATT TAATGCAATT ACAAAAGCAG GAGCAGAGGC 8640 

AGCAAACTAC CCATTTGCGA CGATTGATCC AAATGTTGGA ATGGTGGAAG TTCCAGATGA 8700 

ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG ACAGTTCCCA CAACATTTGA 8760 

ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA GGAGAGGGGC TAGGGAATAA 8820 

ATTCTTGGCC AATATTCGTG AAGTAGATGC GATTGTTCAC GTAGTTCGTG CTTTTGATGA 8880 

TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT GTAGATCCAC TTGCAGATAT 8940 

TGATACCATT AATCTGGAAT TGATTCTTGC TGACTTAGAA TCAGTGAACA AACGATATGC 9000 

GCGTGTAGAA AAGATGGCAC GTACGCAAAA AGATAAAGAA TCAGTAGCAG AATTCAATGT 9060 

TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA GCTCGTACCA TTGAATTTAC 9120 

AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG ACGACTAAAC CAGTTCTTTA 9180 

TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC TCTATCGACT ATGTCAAACA 9240 

AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC GTTATTTCTG CGCGTGCTGA 9300 
GGAAGAAATT TCTGAATTGA ATGATGAAGA TAAAAAAGAG TTTCTTGAAG CCATTGGTTT 9360

GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC CACTTGCTTG GATTGGGAAC 9420 

TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT TTCAAACGTG GTATGAAGGC 9480 

TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA GGCTTTATTC GTGCAGTAAC 9540 

CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG GCCGTAAAAG AAGCTGGACG 9600 

CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC GATATCATGG AATTCCGCTT 9660 

TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA AAAAAATTCC AACCCTTTTG 9720 

GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG CTTGGGAAAT CCAGGGGATA 9780 

AATATTTTGA AACAAAACAC AATGTTGGTT TTATGTTGAT TGATCAACTA GCGAAGAAAC 9840 

AGAATGTCAC TTTTACACAC GATAAGATAT TTCAAGCTGA CCTAGCATCC TTTTTCCTAA 9900 

ATGGAGAAAA AATTTATCTG GTTAAACCAA CGACCTTTAT GAATGAAAGT GGAAAAGCAG 9960 

TTCATGCTTT ATTAACTTAC TATGGTTTGG ATATTGACGA TTTACTTATC ATTTACGATG 10020 

ATCTTGACAT GGAAGTTGGG AAAATTCGTT TAAGAGCAAA AGGCTCAGCA GGTGGTCATA 10080 

ATGGTATCAA GTCTATTATT CAACATATAG GAACTCAGGT CTTTAACCGT GTTAAGATTG 10140 

GAATTGGAAG ACCTAAAAAT GGTATGTCAG TTGTTCATCA TGTTTTGAGT AAGTTTGACA 10200 

GGGATGATTA TATCGGTATT TTACAGTCTG TTGACAAAGT TGACGATTCT GTAAACTACT 10260 

ATTTACAAGA GAAAAATTTT GAGAAAACAA TGCAGAGGTA TAACGGATAA ATGGTGACCT 10320 

TATTAGATTT ATTCTCAGAA AATGATCAGA TTAAAAAATG GCATCAAAAT TTAACAGATA 10380 

AGAAAAGACA ACTAATACTT GGTTTATCAA CATCTACTAA GGCTCTTGCA ATTGCAAGCA 10440 

GTTTAGAAAA AGAAGATAGG ATTGTGTTAT TGACGTCAAC TTATGGAGAA GCAGAAGGAC 10500 

TTGTTAGTGA TCTTATTTCT ATCTTGGGTG AGGAACTCGT CTATCCATTT TTGGTAGATG 10560 

ATGCTCCTAT GGTGGAGTTT TTGATGTCTT CACAGGAAAA AATTATTTCA CGGGTTGAAG 10620 

CCTTGCGTTT TTTGACTGAT TCATCTAAGA AAGGGATTTT AGTTTGTAAT ATCGCAGCAA 10680 

GTCGATTGAT TTTACCGTCT CCCAATGCAT TCAAAGATAG TATTGTAAAA ATCTCAGTTG 10740 

GTGAAGAATA TGATCAACAC GCGTTTATCC ATCAGTTAAA GGAAAATGGC TATCGAAAAG 10800 

TTACTCAAGT ACAAACTCAG GGCGAATTTA GTCTTCGAGG AGATATTTTA GATATTTTTG 10860

AAATATCCCA GTTAGAACCT TGTCGAATTG AGTTTTTTGG TGATGAAATT GATGGTATCA 10920

GGTCATTTGA AGTAGAAACA CAATTATCGA AAGAAAATAA GACAGAACTC ACTATCTTTC 10980 

CAGCTAGTGA TATGCTTTTG AGAGAAAAGG ATTATCAACG AGGACAGTCA GCTTTAGAAA 11040 
AACAAATTTC AAAAACTTTA TCACCTATTT TGAAATCATA CCTAGAAGAA ATTCTTTCAA 11100

GTTTTCACCA AAAACAAAGT CATGCAGACT CTCGGAAGTT TTTATCTTTG TGCTATGATA 11160

AGACATGGAC TGTCTTTGAT TATATTGAAA AAGATACTCC AATATTCTTT GATGATTATC 11220

AAAAATTGAT GAATCAGTAT GAAGTCTTTG AAAGAGACTT AGCGCAGTAC TTTACAGAAG 11280

AATTACAGAA TAGTAAAGCA TTTTCTGATA TGCAGTATTT TTCTGATATT GAACAAATCT 11340

ATAAAAAACA AAGTCCAGTG ACCTTTTTCT CTAATCTTCA AAAGGGTTTA GGAAATCTCA 11400

AATTTGACAA AATTTATCAA TTCAATCAAT ATCCTATGCA GGAATTTTTC AATCAGTTTT 11460

CTTTTCTAAA AGAAGAAATT GAACGATATA AAAAAATGGA TTACACCATT ATTCTGCAGT 11520

CTAGCAATTC AATGGGAAGT AAAACATTGG AGGATATGTT AGAGGAATAT CAGATTAAAT 11580

TGGATTCTAG AGATAAGACA AATATCTGTA AAGAATCTGT AAACTTAATA GAGGGTAATP 11640

TCAGACATGG TTTTCATTTT GTAGATGAAA AGATTTTATT GATAACTGAA CATGAGATTT 11700

TTCAAAAGAA ATTAAAGCGT CGTTTTCGAA GACAACATGT TTCAAATGCA GAGAGATTAA 11760

AAGATTACAA TGAACTTGAA AAAGGGGACT ATGTTGTCCA TCATATCCAT GGGATTGGTC 11820

AATATCTAGG AATTGAAACC ATTGAAATCA AGGGAATTCA TCGCGATTAT GTCAGTGTCC 11880

AATACCAAAA TGGTGATCAA ATTTCTATCC CCGTGGAACA GATTCATCTA CTGTCCAAAT 11940

ATATTTCAAG TGATGGTAAA GCTCCAAAAC TCAATAAATT AAATGACGGT CATTTTAAAA 12000

AGGCCAAGCA AAAGGTTAAG AACCAGGTAG AGGATATAGC TGATGATTTA ATCAAACTCT 12060

ACTCTGAACG TAGTCAGTTG AAGGGTTTTG CTTTCTCAGC TGATGATGAT GATCAAGATG 12120

CCTTTGATGA TGCTTTCCCT TATGTTGAAA CGGATGATCA ACTTCGTAGT ATTGAGGAAA 12180

TCAAGAGGGA TATGCAGGCT TCTCAGCCAA TGGATCGACT TTTAGTTGGG GATGTTGGTT 12240

TTGGAAAGAC TGAAGTTGCT ATGCGTGCAG CCTTTAAAGC AGTCAATGAT CACAAACAGG 12300

TTGTCATTCT AGTTCCGACG 1111 rWj LIA1 AV.OAAT TTTAAGGAAC 12360

GATTCCAAAA TTTTGCAGTT AATATTGATG TGTTGAGTCG CTTTAGAAGT AAAAAAGAGC 12420

AGACTGCAAC ACTTGAAAAA TTGAAAAACG GTCAAGTCGA TATTTTGATT GGAACACATC 12480

GTGTTTTGTC AAAAGATGTT GTGTTTGCTG ATTTGGGCTT GATGATTATT GATGAGGAAC 12540

AGCGATTTGG TGTCAAGCAT AAGGAAACTT TGAAAGAACT GAAGAAACAA GTGGATGTCC 12600

TAACCTTGAC CGCTACGCCA ATCCCTCGTA CCCTCCATAT GTCTATGCTG GGAATCAGAG 12660

ATTTATCTGT TATTGAAACT CCGCCGACTA ATCGCTATCC TGTTCAGACC TATGTTTTGG 12720

AAAAGAATGA TAGTGTCATT CGTGATGCTG TCTTGCGTGA AATGGAGCGT GGAGGTCAAG 12780

TTTATTATCT TTACAACAAA GTTGACACAA TTGTTCAGAA GGTTTCAGAA TTACAGGAGT 12840
TGATTCCGGA GGCTTCGATT GGATATGTTC ATGGTCGAAT GAGTGAAGTC CAGTTGGAAA 12900

ATACTCTATT AGACTTTATT GAGGGACAAT ACGATATCTT GGTGACGACT ACTATTATTG 12960 

AGACAGGGGT GGACATTCCA AATGCTAATA CTTTATTTAT TGAAAATGCG GACCATATGG 13020 

GCTTGTCAAC CTTATATCAG TTAAGAGGAA GAGTCGGTCG TAGTAATCGT ATTGCTTATG 13080 

CTTATCTCAT GTATCGTCCA GAAAAATCAA TCAGTGAAGT CTCTGAAAAG AGATTAGAAG 13140 

CGATTAAAGG ATTTACAGAA TTGGGCTCTG GCTTTAAGAT TGCAATGCGA GATCTTTCGA 13200 

TTCGTGGAGC AGGAAATCTT TTAGGAAAAT CCCAGTCTGG TTTCATTGAT TCTGTTGGTT 13260 

TTGAATTGTA TTCGCAGTTA TTAGAGGAAG CTATTGCTAA ACGAAACGGT AATGCTAACG 13320 

CTAACACAAG AACCAAAGGG AATGCTGAGT TGATTTTGCA AATTGATGCC TATCTTCCTG 13380 

ATACTTATAT TTCTGATCAA CGACATAAGA TTGAAATTTA CAAGAAAATT CGTCAAATTG 13440 

ACAACCGTGT CAATTATGAA GAGTTACAAG AGGAGTTGAT AGACCGTTTT GGAGAATACC 13500 

CAGATGTAGT AGCCTATCTG TTAGAGATTG GTTTGGTCAA ATCATACTTG GACAAGGTCT 13560 

TTGTTCAACG TGTGGAAAGA AAAGATAATA AAATTACAAT TCAATTTGAA AAAGTCACTC 13620 

AACGACTGTT TTTAGCTCAA GATTATTTTA AAGCTTTATC CGTAACGAAC TTAAAAGCAG 13680 

GCATCGCTGA GAATAAGGGA TTAATGGAGC TTGTATTTGA TGTCCAAAAT AAGAAAGATT 13740 

ATGAAATTTT AGAAGGTTTG CTGATTTTTG GAGAAAGTTT ATTAGAGATA AAAGAGTCTA 13800 

AGGAAGAAAA TTCCATTTGA TATTTTTCTT CTATAAAATA GATAAAAATG GTACAATAAT 13860 

AAATTGAGGT AATAAGGATG AGATTAGATA AATATTTAAA AGTATCGCGA ATTATCAAGC 13920 

GTCGTACAGT CGCAAAGGAA GTAGCAGATA AAGGTAGAAT CAAGGTTAAT GGAATCTTGG 13980 

CCAAAAGTTC AACGGACTTG AAAGTTAATG ACCAAGTTGA AATTCGCTTT GGCAATAAGT 14040 

TGCTGCTTGT AAAAGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 14100 

TGTATGAAAT TATCAGTGAA ACACGGGTAG AAGAAAATGT CTAAAAATAT TGTACAAT r ?G 14160 

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 14220 

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 14280 

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 14340 

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACG 14400

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 14460 

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 14520 

ATTAGACGTA ATAGAGCAAT TTTTGAGTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 14580 
TAAAAATCAA TTATTGCGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 14640

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 14700

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 14760

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 14820

TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 14880

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 14940

AACGATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 15000

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 15060

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 15120

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 15180

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 15240

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 15300

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAT AAATGAGGGT CTTTATCAGT 15360

TAGATACGAC TGTAAAATAC GTATCTGCAG TCAATGATTT TCCAGGTTCT TATAAACCAG 15420

AGGGAAGTGG TAGTCTTCCT AAAAAAGAAG ATAATAAAGA ATATTCTTTA AAGGATTTAA 15480

TTACGAAAGT ATCAAAAGAA TCTGATAATG TAGCTCATAA TCTATTGGGA TATTACATTT 15540

CAAACCAATC TGATGCCACA TTCAAATCCA AGATGTCTGC CATTATGGGA GATGATTGGG 15600

ATCCAAAAGA AAAATTGATT TCTTCTAAGA TGGCCGGGAA GTTTATGGAA GCTATTTATA 15660

ATCAAAATGG ATTTGTGCTA GAGTCTTTGA CTAAAACAGA TTTTGATAGT CAGCGAATTG 15720

CCAAAGGTGT TTCTGTTAAA GTAGCTCATA AAATTGGAGA TGCGGATGAA TTTAAGCATG 15780

ATACGGGTGT TGTCTATGCA GATTCTCCAT TTATTCTTTC TATTTTCACT AAGAATTCTG 15840

ATTATGATAC GATTTCTAAG ATAGCCAAGG ATGTTTATGA GGTTCTAAAA TGAGGGAACC 15900

AGATTTTTTA AATCATTTTC TCAAGAAGGG ATATTTCAAA AAGCATGCTA AGGCGGTTCT 15960

AGCTCTTTCT GGTGGATTAG ATTCCATGTT TCTATTTAAG GTATTGTCTA CTTATCAAAA 16020

AGAGTTAGAG ATTGAATTGA TTCTAGCTCA TGTGAATCAT AAGCAGAGAA TTGAATCAGA 16080

TTGGGAAGAA AAGGAATTAA GGAAGTTGGC TGCTGAAGCA GAGCTTCCTA TTTATATCAG 16140

CAATTTTTCA GGAGAATTTT CAGAAGCGCG TGCACGAAAT TTTCGTTATG ATTTTTTTCA 16200

AGAGGTCATG AAAAAGACAG GTGCGACAGC TTTAGTCACT GCCCACCATG CTGATGATCA 16260

GGTGGAAACG ATTTTTATGC GCTTGATTCG AGGAACTCGC TTGCGCTATC TATCAGGAAT 16320

TAAGGAGAAG CAAGTAGTCG GAGAGATAGA AATCATTCGT CCCTTCTTGC ATTTTCAGAA 16380
AAAAGACTTT CCATCAATTT TTCACTTTGA AGATACATCA AATCAGGAGA ATCATTATTT 16440

TCGAAATCGT ATTCGAAATT CTTACTTACC AGAATTGGAA AAAGAAAATC CTCGATTTAG 16500

GGATGCAATC TTAGGCATTG GCAATGAAAT TTTAGATTAT GATTTGGCAA TAGCTGAATT 16560

ATCTAACAAT ATTAATGTGG AAGATTTACA GCAGTTATTT TCTTACTCTG AGTCTACACA 16620

AAGAGTTTTA CTTCAAACTT ATCTGAATCG TTTTCCAGAT TTGAATCTTA CAAAAGCTCA 16680

GTTTGCTGAA GTTCAGCAGA TTTTAAAATC TAAAAGCCAG TATCGTCATC CGATTAAAAA 16740

TGGCTATGAA TTGATAAAAG AGTACCAACA GTTTCAGATT TGTAAAATCA GTCCGCAGGC 16800

TGATGAAAAG GAAGATGAAC TTGTGTTACA CTATCAAAAT CAGGTAGCTT ATCAAGGATA 16860

TTTATTTTCT TTTGGACTTC CATTAGAAGG TGAATTAATT CAACAAATAC CTGTTTCACG 16920

TGAAACATCC ATACACATTC GTCATCGAAA AACAGGAGAT GTTTTGATTA AAAATGGGCA 16980

TAGAAAAAAA CTCAGACGTT TATTTATTGA TTTGAAAATC CCTATGGAAA AGAGAAACTC 17040

TGCTCTTATT ATTGAGCAAT TTGGTGAAAT TGTCTCAATT TTGGGAATTG CGACCAATAA 17100

TTTGAGTAAA AAAACGAAAA ATGATATAAT GAACACTGTA CTTTATATAG AAAAAATAGA 17160

TAGGTAAAAA ATGTTAGAAA ACGATATTAA AAAAGTCCTC GTTTCACACG ATGAAATTAC 17220

AGAAGCAGCT AAAAAACTAG GTGCTCAATT AACTAAAGAC TATGCAGGAA AAAATCCAAT 17280

CTTAGTTGGG ATTTTAAAAG GATCTATTCC TTTTATGGCT GAATTGGTCA AACATATTGA 17340

TACACATATT GAAATGGACT TCATGATGGT TTCTAGCTAC CATGGTGGAA CAGCAAGTAG 17400

TGGTGTTATC AATATTAAAC AAGATGTGAC TCAAGATATC AAAGGAAGAC ATGTTCTATT 17460

TGTAGAAGAT ATCATTGATA CAGGTCAAAC TTTGAAGAAT TTGCGAGATA TGTTTAAAGA 17520

AAGAGAAGCA GCTTCTGTTA AAATTGCAAC CTTGTTGGAT AAACCAGAAG GACGTGTTGT 17580

AGAAATTGAG GCAGACTATA CTTGCTTTAC TATCCCAAAT GAGTTTGTAG TAGGTTATGG 17640

TTTAGACTAC AAAGAAAATT ATCGTAATCT TCCTTATATT GGAGTATTGA AAGAGGAAGT 17700

GTATTCAAAT TAGAAAGAAT AATCTTTAAT GAAAAAACAA AATAATGGTT TAATTAAAAA 17760

TCCTTTTCTA TGGTTATTAT TTATCTTTTT CCTTGTGACA GGATTCCAGT ATTTCTATTC 17820

TGGGAATAAC TCAGGAGGAA GTCAGCAAAT CAACTATACT GAGTTGGTAC AAGAAATTAC 17880

CGATGGTAAT GTAAAAGAAT TAACTTACCA ACCAAATGGT AGTGTTATCG AAGTTTCTGG 17940

TGTCTATAAA AATCCTAAAA CAAGTAAAGA AGAAACAGGT ATTCAGTTTT TCACGCCATC 18000

TGTTACTAAG GTAGAGAAAT TTACCAGCAC TATTCTTCCT GCAGATACTA CCGTATCAGA 18060

ATTGCAAAAA CTTGCTACTG ACCATAAAGC AGAAGTAACT GTTAAGCATG AAAGTTCAAG 18120
TGGTATATGG ATTAATCTAC TCGTATCCAT TGTGCCATTT GGAATTCTAT TCTTCTTCCr 18180

ATTCTCTATG ATGGGAAATA TGGGAGGAGG CAATGGCCGT AATCCAATGA GTTTTGGACG 18240

TAGTAAGGCT AAAGCAGCAA ATAAAGAAGA TATTAAAGTA AGATTTTCAG ATGTTGCTGG 18300

AGCTGAGGAA GAAAAACAAG AACTAGTTGA AGTTGTTGAG TTCTTAAAAG ATCCAAAACG 18360

ATTCACAAAA CTTGGAGCCC GTATTCCAGC AGGTGTTCTT TTGGAGGGAC CTCCGGGGAC 18420

AGGTAAAACT TTGCTTGCTA AGGCAGTCGC TGGAGAAGCA GGTGTTCCAT TCTTTAGTAT 18480

CTCAGGTTCT GACTTTGTAG AAATGTTTGT CGGAGTTGGA GCTAGTCGTG TTCGCTCTCT 18540

TTTTGAGGAT GCCAAAAAAG CAGCACCAGC TATCATCTTT ATCGATGAAA TTGATGCTGT 18600

TGGACGTCAA CGTGGAGTCG GTCTCGGCGG AGGTAATGAC GAACGTGAAC AAACCTTGAA 18660

CCAACTTTTG ATTGAGATGG ATGGTTTTGA GGGAAATGAA GGGATTATCG TCATCGCTGC 18720

GACAAACCGT TCAGATGTAC TTGACCCTGC CCTTTTGCGT CCAGGACGTT TTGATAGAAA 18780

AGTATTGGTT GGTCGTCCTG ATGTTAAAGG TCGTGAAGCA ATCTTGAAAG TTCACGCTAA 18840

GAATAAGCCT TTAGCAGAAG ATGTTGATTT GAAATTAGTG GCTCAACAAA CTCCAGGCTT 18900

TGTTGGTGCT GATTTAGAGA ATGTCTTGAA TGAAGCAGCT TTAGTTGCTG CTCGTCGCAA 18960

TAAATCGATA ATTGATGCTT CAGATATTGA TGAAGCAGAA GATAGAGTTA TTGCTGGACC 19020

TTCTAAGAAA GATAAGACAG TTTCACAAAA AGAACGAGAA TTGGTTGCTT ACCATGAGGC 19080

AGGACATACC ATTGTTGGTC TAGTCTTGTC GAATGCTCGC GTTGTCCATA AGGTTACAAT 19140

TGTACCACGC GGCCGTGCAG GCGGATACAT GATTGCACTT CCTAAAGAGG ATCAAATGCT 19200

TCTATCTAAA GAAGATATGA AAGAGCAATT GGCTGGCTTA ATGGGTGGAC GTGTAGCTGA 19260

AGAAATTATC TTTAATGTCC AAACCACAGG AGCTTCAAAC GACTTTGAAC AAGCGACACA 19320

AATGGCACGT GCAATGGTTA CAGAGTACGG TATGAGTGAA AAACTTGGCC CAGTACAATA 19380

TGAAGGAAAC CATGCTATGC TTGGTGCACA GAGTCCTCAA AAATCAATTT CAGAACAAAC 19440

AGCTTATGAA ATTGATGAAG AGGTTCGTTC ATTATTAAAT GAGGCACGAA ATAAAGCTGC 19500

TGAAATTATT CAGTCAAATC GTGAAACTCA CAAGTTAATT GCAGAAGCAT TATTGAAATA 19560

CGAAACATTG GATAGTACAC AAATTAAAGC TCTTTACGAA ACAGGAAAGA TGCCTGAAGC 19620

AGTAGAAGAG GAATCTCATG CACTATCCTA TGATGAAGTA AAGTCAAAAA TGAATGACGA 19680

AAAATAACCC TGAGAGAGGC TGGAGCCTCT CTTTTTTGTG CAGTTTAGGA GCTAAAGGGA 19740

ACAGAATGGA GAAAATGGAA CAAATGTGTT TTCTAATCTG TTAGACTGTA TCTAGAAAGG 19800

GGAAAATTAT GATTAAAGAA TTGTATGAAG AAGTCCAAGG GACTGTGTAT AAGTGTAGAA 19860

ATGAATATTA CCTTCATTTA TGGGAATTGT CGGATTGGGA GCAAGAAGGC ATGCTCTGCT 19920
TACATGAATT GATTAGTAGA GAAGAAGGAC TGGTAGACGA TATTCCACGT TTAAGGAAAT 19980

ATTTCAAGAC CAAGTTTCGA AATCGAATTT TAGACTATAT CCGTAAACAG GAAAGTCAGA 20040

AGCGTAGATA CGATAAAGAA CCCTATGAAG AAGTGGGTGA GATCAGTCAT CGTATAAGTG 20100

AGGGGGGTCT CTGGCTAGAT GATTATTATC TCTTTCATGA AACACTAAGA GATTATAGAA 20160

ACAAACAAAG TAAAGAGAAA CAAGAAGAAC TAGAACGCGT CTTAAGCAAT GAACGATTTC 20220

GAGGGCGTCA AAGAGTATTA AGAGACTTAC GCATTGTGTT TAAGGAGTTT ACTATCCGTA 20280

CCCACTAGTA AGTCATGCAA AAAAAATGAA AAAAATTAGA AAAAGTAGTT GACAAAGTTT 20340

GAAAAGGCTG TATAATAGTA AGAGTTGAAA ATAACAACTC AGGTCCGTTG GTCAAGGGGT 20400

TAAGACACCG CCTTTTCACG GCGGTAACAC GGGTTCGAAT CCCGTACGGA CTATGGTATG 20460

TTGCGTCAGG ACCACTTGAT GAAAAAAAGT TTAAAAAAAC TTAAAAATCT TCAAAAAAGT 20520

GTTGACAAGC GAAAGCAGTT GTGATATACT AATATAGTTG TCGCTTGAGA GAAGCAAGTG 20580

ACAAAGACCT TTGAAAACTG AACAAGACGA ACCAATGTGC AGGGCGCTAC AACGTAAGTT 20640

GTAGTACTGA ACAATGAAAA AAACAATAAA TCTGTCAGTG ACAGAAATGA GTAAGAACTC 20700

AAACTTTTTA ATGAGAGTTT GATCCTGGCT CAGGACGAAC GCTGGCGGCG TGCCTAATAC 20760

ATGCAAGTAG AACGCTGAAG GAGGAGCTTG CTTCTCTGGA TGAGTTGCGA ACGGGTGAGT 20820

AACGCGTAGG TAACCTGCCT GGTAGCGGGG GATAACTATT GGAAACGATA GCTAATACCG 20880

CATAAGAGTA GATGTTGCAT GACATTTGCT TAAAAGGTGC ACTTGCATCA CTACCAGATG 20940

GACCTGCGTT GTATTAGCTA GTTGGTGGGG TAACGGCTCA CCAAGGCGAC GATACATAGC 21000

CGACCTGAGA GGGTGATCGG CCACACTGGG ACTGAGACAC GGCCCAGACT CCTACGGGAG 21060

GCAGCAGTAG GGAATCTTCG GCAATGGACG GAAGTCTGAC CGAGCAACGC CGCGTGAGTG 21120

AAGAAGGTTT TCGGATCGTA AAGCTCTGTT GTAAGAGAAG AACGAGTGTG AGAGTGGAAA 21180

GTTCACACTG TGACGGTATC TTACCAGAAA GGGACGGCTA ACTACGTGCC AGCAGCCGCG 21240

GTAATACGTA GGTCCCGAGC GTTGTCCGGA TTTATTGGGC GTAAAGCGAG CGCAGGCGGT 21300

TAGATAAGTC TGAAGTTAAA GGCTGTGGCT TAACCATA 21338


(2) INFORMATION FOR SEQ ID NO: 21: 







(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6273 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:



TGTTTTTAAA GAGCCGTGTC TGGATAGACT TTCGGACGCA ACGCTCTATT AGATAATGAA 60

CTGCCTATAC ACAAGATTTC TAACCTTAGT CGACATGAGC TGAAACCTCT TATTTGTTAA 120

GTAGTTCACA AAATATTATA CACCTATTTT ATGAATAGTC AACTGTCTTT ACAGTAAAAT 180

TTTAGAAAAT CATGAAAATT TTCTCTTTCT TTCCATTTTA AGTGACATTC AGTCATTCTC 240

ACATCAAAAA AGCCCAGACG AAATTGTCTG AGCATTCTTT TATCTAGTCG TTTAAGGAAG 300

TTGAGTTCAG TATGTTTAAA GTCTCTGTCC CATCATTTCT TCAACAAACC TTGTTCTTGG 360

AGAAACTCCT TGGCTACTTG CTTTGCTGAC TTGCCTTCAA CACCGACTTG GTAGTTGAGC 420

TGGCTCATCT GGCTTTCTGT AATCTTACCA GCCAATGTAT TAAGAACTCT TTCCAACTCT 480

GGGTGTTTCT TGAGAAGAGC TTCTTTCATG AGTGGAGCCC CTTGATAAGG TGGGAAGAGT 540

TGCTTGTCAT CTTCCAAGAC CTGTAAATCA TAACGCTCCA ATTCCGCATC AGTCGAATAG 600

GCATCCGTGA TTTGAATATC CCCTGACTGA ATAGCCTGAT AGCGAAGGGC TGGCTCAATG 660

GTCGCTACAT TGAGATTGAG ACCATACATT GATTGCAAGC CCTTATTTCC ATCTTCACGG 720

TCGTTAAACT CGAGTGTAAA ACCTGCCTTC AACTGCCCTT CCACTTTTTT CAAGTCTGAA 780

ATGGTCTTCA AGCCATATTC TTGAGCAATC TTTTTCGGAA CAGCTACAGC ATAGGTGTTT 840

TGATAAGACA TGGGTTTGAG ATAGGCTAGA TGATCCTGCT TAGCAATGCC ATCACGCGCC 900

ACCTGATAAA CCTGTTCTGG TTCATGACTC ACCTTGGGTG ATGGTTGAAG CAAACTTTCA 960

GTCACCGTAC CAGTAAATTC AGGATAGATG TCAATATCGC CTTTTTTCAG AGCTTCATAA 1020

AGGAAGCTTG TCTTCCCAAA ATTCGGTTTA ACAGTCGCAG TCATGCTGGT ATTTTCTTCA 1080

ATCAGCAACT TATACATATT GGCCAAAATT TCTGGTTCTG GACCTATTTT CCCAGCAATA 1140

ACCAAGTTTT CCTTCTCTTT TTGAACCAAA AGAGCTGGAC TATAAGACAG ACCCAGTAAT 1200

AAAGCCACCA AGGCAAAACC TGAGAAAATC GTCCGTAATT TTGCTTTTTC CATCACTTTT 1260

AGTAGGAAGT TAAAGGCAAT GGCTAGCACT GCAGAAGAAA GTGCCCCAAT CAAAATCAAA 1320

CTGGCATTAT TACGGTCAAT TCCCAAAAGA ATAAAGGAAC CTAGTCCCCC TGCACCAATC 1380

AAGGCCGCCA AGGTTGCCGT ACCGATAATC AAAACAGCTG CCGTCCGAAT CCCAGACATG 1440

ATAACAGGCA TGGCGAGTGG AATTTCAAAT TTCTTGAGAC GTTCCCATCT GGTCATCCCA 1500

AAGGCAATCC CAGCCTCTTG CAGGTTCGGA TCAATTCCCT TCAGCCCAGT GATAGTATTT 1560

TGCAAAATAG GGAAAATCGC ATAAATCACT AGAGCTGTCA AAGCCGGCAA GGTCCCAATT 1620

CCCATCAAAG GGATAAAGAG CCCCAACAAG GCCAGAGACG GGATGGTCTG GAAAATACCT 1680

GCAATCTGCA AGACCCAGTC GGCCAGCTTC TCATGATAGC GAAGAAAAAC AGCCAAGGGA 1740
ATCGCAAGCA AAATAGCTAG TAACAAGGTC AAAAGCGACA ACTGCAAATG TTGAGATAGA 1800

GCTGTCAACC AATCACTAAA ACGATCCTGA AAAGTTGCAA TTAAATTAGT CATGAACACT 1860

ACCTCCAAAC AAGTCTGCTA CAAAGTCTGT TGCAGGCGCT TTTAAAATTG TCTCGGGATT 1920

CGCTACCTGG CGAATTTCTC CATCCTGCAA GACAGCAATA CGGTCCGCCA ACTTCAAGGC 1980

TTCATCCGTA TCATGGGTTA CAAAAATCGT TGTCATCCCA AACTCTTTAT GCAATTCTTT 2040

TGTCAGAACC TGCAACTGTT TTCTCGAAAT AGCATCCAAG GCCGAAAAGG GTTCATCCAT 2100

GAGGAAAATC TTGGGCTGAC CAATCATAGC TCGGACAATA CCGACCCGTT GCTGTTCTCC 2160

ACCAGATAAT TCACTAGGTA AGCGATGCCC ATACTCGGCT ACTGGTAAAC CAACCTTAGC 2220

CAAAAGCTCT TCTGTTTTCT TCGTAATTTC TTCCTTGCTC CACCCCTTCA TTTCAGGAAT 2280

GAGAGCAATA TTTTCCGCAA CTGTTAGATT TGGAAAAAGA GCAATAGCCT GTAAAACATA 2340

ACCAGTAGAA AGACGAAGTT CACGCTCATC ATAGTCTTTG ATGCGCTTCC CATCCATATA 2400

AATATTTCCA TCAGTTGGTT CCAAAAGACG GTTAATCATC TTGAGCATGG TCGTCTTACC 2460

TGACCCAGAA GGCCCTACTA AAACCATAAA TTCCCCATCC TCAATCTGTA AGTTGACATC 2520

TCTCAAGACA TCCTTTTCTG TGTAGCGCAG TGCTACATTT TTGTATTCAA TCATTCTTTG 2580

TCCTCAATTT AAAACTTCCC TCGATTGGTC AAGTCTTCTA CCTTAGGCAT AACTTCCTTA 2640

TTATCCCAAT GCTCCACAAT TTTCCCGTTC TCTAAACGGA AGATATCGTA CTGGGCATAA 2700

GCAACGCCAT CAATCTGAGT CTGACCATAG CTAACCACAT AGTTTCCTTG TCCTAAGAGT 2760

TGGAAAACAA AGTCAAAAGT GACACTATAT TCAGCCACAT AGTTTTTATA AGCAGCACTT 2820

CCTTGTCCAA TATCATGATT ATGCTGAATC AAATCGTCTG CCACATAATC ACTCCACTGC 2880

TCTAGCTCCC CATTTTGGAA AATTTCTGTC AAGAAACGGC GAACCAGCTT TTTATTTTCT 2940

GCTTTCTTAT CCAAATCCTT GATTTCAAAA TCTCCAAAAA TTTGATCTAG TTGGTCATTT 3000

TCAGGTGTTC GATAGTAGTC AATGACATCC CAATGCTCAA CAATACAACC ATTCTCATCC 3060

TCACGGAAAG TATCCGTCGT CACCCATTGA GCTTCTCCAC CATTCAGATA TTGATGAACA 3120

TGAACAAAGA CCAGATTGCC ATCCTCAATG GTGCGGACAA TCTTAATCTG ACGCTCTGGA 3180

TGACGCTCAA AGAAATCTGC AAAGAAGGCT GCAAATCCTT CTTTCCCGTC AGGAACACCT 3240

GTCGAATGTT GGATATAGGT ATCCCCTACA GACTGGGCTT GAGCCTCAGC AACTCGTCCG 3300

TCTTGAATGG CATGGATGTA TAGGTTGTGA GCATTTTTCA CTTGTTGTGA CATATTCTAA 3360

ACCTCATTTC CCTTCTCTTT CAGATTCGCC AAAATTCTTT CTTGAAAACC TTCAAATTGG 3420

TGAATTTCTT CCTCTGAAAA TCCTTTGTAA AAGATAGTAT CCAATTTCTG ACTGACACGA 3480
TGCCCCACTT CTTTCTGGGA CTTGCCTAAC TCCGTTAAAA CTAAATACTT CTTACGCTTG 3540

TCTTTTCCAC ACGGACTAAC AATTACAAGC TTTTGTTCCT CTAGCTTTTT TATCATAGTC 3600

GTCAGCGTAT TATTCGCAAG TCCAGTCGCA AGCGCGATAT CTGTCGCAGT TGCGCAGCCA 3660

GTTTCACTAT TCCATAAAAC CGCTAAAATC TTGCCCTGTT CACCCCTATA AAGAGCCTCA 3720

GGATCTTGAC TCAGTAACTT TTGAAAAATC CGCCCATTCA ACAAACGAAT ATGATGGGCT 3780

AGCAAATGAC CATCTTTCAT AACACCTCCA ATTTATTTCG ATATCGAAAT GAATAAAACA 3840

ATTGTAACAC TCATCGTTCT AACTGTCAAC TATTTCGATT TAGAAATAAT TTTTGATAAT 3900

TATCCACACC ACCATACTCC GGCTCAACTA ACTTTTAACG AGAGTTTCTA AACTCCTTCG 3960

TCCTCCAGTC TACAAAAGCC TTCCATTCGT ACTATCCTAT ATTTTATGAG GGGACACATT 4020

TTTCCTATCA GACCATTTAT TTTAAAGATA GAAGTAAATC ATAATTGCTT CCATCTGTTC 4080

TTTTATAGTA TATTGAAGTT AGACTAGAGC ACTGTATCTT CTAAAACATT GATAGAAAGC 4140

GATTTGAATT TCCCAATCAA TTTGTTCGTA TTTATAGCAT TTCGAAACTG GAATAGGACA 4200

CCATGACTGC TAAAAGATTT CTATAAATTC ATTTAATTTC CTCAATCAAT TTGTTCATAT 4260

CTTATTTCAT TCCGCTATAA TTTCACCTTA CCCTATCTTT TTCGTAGCAC CCTTCAAACA 4320

GCCTATCCCC TACCGTTTGA CGATTCCTCA CTTCGCTCCA CTTCCATTAC AGAAGTTTCT 4380

TCACTACTAT GGGCTCGGCT GACTTCTCAT GATTCCTTGT TACTACTATT TGAACGCTCA 4440

CGAGATAGAT CTTACAAAAA ATGCTTTGAT CCACAATGGA ATCAAAGCAT TTTAAAGAGT 4500

TCCTCATACA TAAGCGCAGA AGTCGCAGTT CCTCTGTACT TGGCTTCTTC TCTTTTGACA 4560

AAGCGAGCCA AGTTGAGCAA CTCAGGTGCT GGATGTTTGG GATTTAGGAG CAATTCACGA 4620

TTGACCAGGC CTGAGAGACG AACTGCCTGC AATTGCTCAT TTGTAGTAGG CAGTTTTTTA 4680

GTAGTCTCTA GGAGAGCAGC AACTAAATCT TCACTCAAAT CATGTCGAGC ATGATTGTAA 4740

AGATCTTTTA TAAGGCTTTC TAGGTTTGGT TCTACCATCC 1 Inluul 1 4800

TAATAATGTT TAATCAAATC AACCGTTGAA CGATCCAATT TCTTCACCAA GGCTTGTAAG 4860

AAAGCTTGCG CTTCTAGGAA GTCATCCATT GCATAGAGGG TTTGGTGAGA ATGGATATAA 4920

CGAGCGCAGA CACCGATAGT TGTTGATGGG ACACCACCAT TTTTCAGATG AGCTGCACCT 4980

GCATCTGTTC CGCCTTTACC ACAGTAGTAT TGGTACTTGA TACCAGCTTC TTCAGCCGTT 5040

GTCAAAAGGA AATCCTTCAT CCCTGGGAGA AGCAAGTGAC CTGGATCATA GAAACGAATC 5100

AAGGTTCCAT CTCCAATCTT GCCTTGACCA CCGTAGACAT CACCTGCTGG TGAGCAATCA 5160

ACTGCGAGGA AGACTTCTGG GTCAAACTTG GTTGTAGAGG TATGAGCGCC ACGCAGACCA 5220

ACTTCTTCTT GGACGTTAGA ACCCAGATAG AGTTCATTGC CGAGTTTTTG ACCCGATAAA 5280
GCTTCAGCTA GCTCGCTTAC CATGAGGACA CCGTAGCGGT TATCCCAAGC TTTTGAGATG 5340

ATATTTTTTT CATTGGCTGT CAAAATTGCA GAACTATCTG GTACAATGGT ATCACCAGGA 5400 

CGGATGCCAA AACTTTCTGC CTCAGCCTTG TCCGCAAAAC CACCATCAAA AACGATATCG 5460 

GCAATGGCTG GCATGGTTGG TCCCCCCTTT CCACGAGTCA AATGCGGAGG AACAGAACCT 5520 

GAAATCACAG GAATTTCATG ACCATCACGA GTCAAGAGTT TGAAACGTTG GCTGCTAACC 5580 

ACCATGGGGT TCCAGCCACC GATTTCTACG ACACGGAAGG TACCATCTGG CTTGATTTCG 5640 

CTGACCATAA AACCAACTTC GTCCATATGA GAAGCGACCA AGACGCGCGG TGCATCCACA 5700 

GCTTCTGAAT GTTTGATACC AAAAATACCA CCCAAGCCAT CTGTCACCAC TTCATCCACA 5760 

TGCGGTGTCA ACTTTTCACG AAGATAAGCA CGGACAGGCG CTTCATGACC TGAGACTGCA 5820 

GCAAGTTCTG TTACTTCTTT AATTTTTGAA AATAATGTTG TCATTTCAGT TCCTTCTTTC 5880 

TTTCATCCAT TTTACCACTT TTTATAGGAG AAGGATAGTG GGAAGGTGGA TTTCTAAGTT 5940 

AGTATCTTAG TCCTGCTCTA TCTTAGAAAA GGATAGTATT CTCTTGCATG TAGTGCAAAA 6000 

TCTAGTAAAC ATTCCAAAAT TAACTCGAAT ATTTATTTCC AAACAAAAAA ACAATACACC 6060 

ATCAAAGTTG TTTGGATTTT TCATGAAATT TACAGAAAAT AGTTGACTTC CCTTTCTTCT 6120 

TTCTTTAAAT ATATAGTTGG TTGAGTTTGG AATAGTACGC TGTAGCTGCT AAAACATTTC 6180 

TAGAAATTAA TTTGACTTTC CTAATAGAGT TGTTCATATC TTATTTCAAT TTACTATAGT 6240 

ACAAAACTAG AAAAGGAAAA AATCATGACC AGG 6273 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACAACCTTTT TCAAAAACTC ACCTTGGTAC GGAGATGTTT TGCTTTCTGC TATTATTTTC 60 

GGTTATATTC ATATCAATTT TGCTTTAACT CCTCTTGCTT TTTTCATTTA TGCTAGTGGA 120 

GGTCTTATTT TAGCTCTATT GTATCGCATG ACTAAAAATC TCTACTATCC AATACTAGTT 180

CATATTCTCA TTAATATCAC TGCCTTCTGG GATGTGTGGT TGCTCCTATT TTCAGGAAGT 240 

TAGCTTACTA AAATAATGTC GGAACTTTCC GGCATTTTCT TTTTTCACAA ATAGTCAACG 300 

TTTTTCTTTT CGATATTGTA GTGGTGTGTA TCCAGTTATT TTTTTGAATT GATTTTGAAA 360 
ATAAGGTTGA CTTGAGAAAG GCAGATAGTG AAGATAGTTA AGAAGAATAG GATGTTCTTT 420

TTTCCTTTTT GGAAAACTTC TAAAATATGG TATAATGAAA AGATAAAGAA GTTGGGGGTA 480

GAAGATGAAC ATTCAACAAT TACGCTATGT TGTGGCTATT GCCAATAGTG GTACTTTTCG 540

TGAAGCTGCT GAAAAGATGT ATGTTAGTCA GCCGAGTCTG TCTATTTCTG TTCGTGATTT 600

GGAAAAAGAG TTGGGCTTTA AGATTTTCCG TCGGACCAGC TCAGGGACTT TCTTGACCCG 660

TCGTGGGATG GAATTTTATG AAAAATCGCA AGAATTGGTT AAAGGATTTG ATATTTTTCA 720

AAATCAGTAT GCCAATCCTG AAGAAGAAAA AGATGAATTT TCTGTTGCTA GCCAGCACTA 780

TGACTTCTTG CCACCAACTA TTACGGCCTT TTCAGAGCGC TATCCTGACT ATAAGAACTT 840

CCGTATTTTT GAATCAACTA CTGTTCAAAT ATTAGATGAA GTGGCGCAAG GGCATAGTGA 900

GATTGGGATT ATCTACCTCA ACAATCAAAA TAAAAAGGGG ATTATGCAAC GGGTTGAAAA 960

ATTAGGTCTG GAGGTCATCG AATTGATTCC TTTCCATACC CATATTTATC TCCGTGAGGG 1020

TCATCCTTTA GCCCAGAAAG AGGAATTAGT CATGGAGGAT TTAGCGGATT TACCAACGGT 1080

TCGTTTCACT CAAGAGAAAG ACGAGTACCT TTATTATTCA GAGAACTTTG TCGATACCAG 1140

CGCTAGCTCA CAGATGTTTA ATGTGACAGA CCGTGCCACC TTGAATGGTA TTTTGGAGCG 1200

GACGGACGCC TATGCGACAG GTTCTGGATT TTTAGATAGT GACAGTGTTA ATGGCATTAC 1260

AGTTATTCGT CTCAAGGATA ACCTAGATAA CCGCATGGTC TATGTTAAAC GTGAAGAAGT 1320

GGAGCTTAGT CAAGCTGGGA CTCTCTTCGT AGAAGTCATG CAAGAATATT TTGATCAAAA 1380

GAGGAAATCA TGAAAAAAAG AGCAATAGTG GCAGTCATTG TACTGCTTTT GATTGGGCTG 1440

GATCAGTTGG TCAAATCCTA TATCGTCCAG CAGATTCCAC TGGGTGAAGT GCGCTCCTGG 1500

ATCCCCAATT TCGTTAGCTT GACCTACCTG CAAAATCGAG GTGCAGCCTT TTCTATCTTA 1560

CAAGATCAGC AGCTGTTATT CGCTGTCATT ACTCTGGTTG TCGTGATAGG TGCCATTTGG 1620

TATTTACATA AACACATGGA GGACTCATTC TGGATGGTCT TGGGTTTGAC TCTAATAATC 1680

GCGGGTGGTC TTGGAAACTT TATTGACAGG GTCAGTCAGG GCTTTGTTGT GGATATGTTC 1740

CACCTTGACT TTATCAACTT TGCAATTTTC AATGTGGCAG ATAGCTATCT GACGGTTGGA 1800

GTGATTATTT TATTGATTGC AATGCTAAAA GAGGAAATAA ATGGAAATTA AAATTGAAAC 1860

TGGTGGTCTG CGTTTGGATA AGGCTTTGTC AGATTTGTCA GAATTATCAC GTAGTCTCGC 1920

GAATGAACAA ATTAAATCAG GCCAGGTCTT GGTCAATGGT CAAGTCAAGA AAGCTAAATA 1980

CACAGTCCAA GAGGGTGATG TCGTCACTTA CCATGTGCCA GAACCAGAGG TATTAGAGTA 2040

TGTGGCTGAG GATCTTCCGC TAGAAATAGT CTACCAAGAT GAGGATGTGG CTGTCGTTAA 2100

CAAACCTCAG GGAATGGTTG TGCACCCGAG TGCTGGTCAT ACCAGTGGAA CCCTAGTAAA 2160
TGCCCTCATG TATCATATTA AGGACTTGTC GGGTATCAAT GGGGTTCTGC GTCCAGGGAT 2220

TGTTCACCGT ATTGATAAGG ATACGTCAGG TCTTCTCATG ATTGCTAAAA ACGATGATGC 2280 

GCATCTAGCA CTTGCCCAAG AACTCAAGGA TAAAAAGTCT CTCCGCAAAT ATTGGGCGAT 2340 

TGTTCATGGA AATCTACCTA ATGATCGTGG TGTAATTGAA GCGCCGATTG GCCGGAGTGA 2400 

AAAAGACCGT AAGAAACAGG CTGTAACTGC TAAAGGGAAG CCTGCAGTGA CGCGTTTTCA 2460 

CGTCTTGGAA CGCTTTGGCG ATTATAGCTT AGTAGAGTTG CAACTGGAGA CAGGGCGCAC 2520 

TCATCAAATC CGTGTCCACA TGGCTTATAT CGGCCATCCA GTCGCTGGTG ATGAGGTCTA 2580 

TGGTCCTCGC AAGACTTTGA AAGGACATGG ACAATTTCTT CATGCCAAGA CTTTAGGTTT 2640 

TACTCATCCG AGAACAGGTA AGACCTTGGA ATTTAAAGCA GATATCCCAG AGATTTTTAA 2700 

GGAAACCTTG GAGAGATTGA GAAAGTAAGA ATGAAAAAGA AATTAACTAG TTTAGCACTT 2760 

GTAGGCGCTT TTTTAGGTTT GTCATGGTAT GGGAATGTTC AGGCTCAAGA AAGTTCAGGA 2820 

AATAAAATCC ACTTTATCAA TGTTCAAGAA GGTGGCAGTG ATGCGATTAT TCTTGAAAGC 2880 

AATGGACATT TTGCCATGGT GGATACAGGA GAAGATTATG ATTTCCCAGA TGGAAGTGAT 2940

TCTCGCTATC CATGGAGAGA AGGAATTGAA ACGTCTTATA AGCATGTTCT AACAGACCGT 3000 

GTCTTTCGTC GTTTGAAGGA ATTGGGTGTC CAAAAACTTG ATTTTATTTT GGTGACCCAT 3060 

ACCCACAGTG ATCATATTGG AAATGTTGAT GAATTACTGT CTACCTATCC AGTTGACCGA 3120 

GTCTATCTTA AGAAATATAG TGATAGTCGT ATTACTAATT CTGAACGTCT ATGGGATAAT 3180 

CTGTATGGCT ATGATAAGGT TTTACAGACT GCTGCAGAAA AAGGTGTTTC AGTTATTCAA 3240 

AATATCACAC AAGGGGATGC TCATTTTCAG TTTGGGGACA TGGATATTCA GCTCTATAAT 3300 

TATGAAAATG AAACTGATTC ATCGGGTGAA TTAAAGAAAA TTTGGGATGA CAATTCCAAT 3360 

TCCTTGATTA GCGTGGTGAA AGTCAATGGC AAGAAAATTT ACCTTGGGGG CGATTTAGAT 3420 

AATGTTCATG GAGCAGAAGA CAAGTATGGT CCTCTCATTG GAAAAGTTGA TTTGATGAAG 3480 

TTTAATCATC ACCATGATAC CAACAAATCA AATACCAAGG ATTTCATTAA AAATTTGAGT 3540 

CCGAGTTTGA TTGTTCAAAC TTCGGATAGT CTACCTTGGA AAAATGGTGT TGATAGTGAG 3600 

TATGTTAATT GGCTCAAAGA ACGAGGAATT GAGAGAATCA ACGCAGCCAG CAAAGACTAT 3660 

GATGCAACAG TTTTTGATAT TCGAAAAGAC GGTTTTGTCA ATATTTCAAC ATCCTACAAG 3720

CCGATTCCAA GTTTTCAAGC TGGTTGGCAT AAGAGTGCAT ATGGGAACTG GTGGTATCAA 3780 

GCGCCTGATT CTACAGGAGA GTATGCTGTC GGTTGGAATG AAATCGAAGG TGAATGGTAT 3840 

TACTTTAACC AAACGGGTAT CTTGTTACAG AATCAATGGA AAAAATGGAA CAATCATTGG 3900 
TTCTATTTGA CAGACTCTGG TGCTTCTGCT AAAAATTGGA AGAAAATCGC TGGAATCTGG 3960

TATTATTTTA ACAAAGAAAA CCAGATGGAA ATTGGTTGGA TTCAAGATAA AGAGCAGTGG 4020

TATTATTTGG ATGTTGATGG TTCTATGAAG ACAGGATGGC TTCAATATAT GGGGCAATGG 4080

TATTACTTTG CTCCATCAGG GGAAATGAAA ATGGGCTGGG TAAAAGATAA AGAAACCTGG 4140

TACTATATGG ATTCTACTGG TGTCATGAAG ACAGGTGAGA TAGAAGTTGC TGGTCAACAT 4200

TATTATCTGG AAGATTCAGG AGCTATGAAG CAAGGCTGGC ATAAAAAGGC AAATGATTGG 4260

TATTTCTACA AGACAGACGG TTCACGAGCT GTGGGTTGGA TCAAGGACAA GGATAAATGG 4320

TACTTCTTGA AAGAAAATGG TCAATTACTT GTGAACGGTA AGACACCAGA AGGTTATACT 4380

GTGGATTCAA GTGGTGCCTG GTTAGTGGAT GTTTCGATCG AGAAATCTGC TACAATTAAA 4440

ACTACAAGTC ATTCAGAAAT AAAAGAATCC AAAGAAGTAG TGAAAAAGGA TCTTGAAAAT 4500

AAAGAAACGA GTCAACATGA AAGTGTTACA AATTTTTCAA CTAGTCAAGA TTTGACATCC 4560

TCAACTTCAC AAAGCTCTGA AACGAGTGTA AACAAATCGG AATCAGAACA GTAGTAGAAA 4620

AGAAGGTTTT AGGGCCTTCT TTTTCCTATC AACTCTTTTC TATTTCCTGT TATTCATGTT 4680

ATAATGGATA AATATGAATA ATCGGAGTGA GACTATGAAA TACAAACGGA TTGTCTTTAA 4740

GGTGGGTACT TCTTCTCTGA CAAATGAGGA TGGAAGTTTA TCACGTAGTA AGGTAAAGGA 4800

TATTACCCAG CAGTTGGCTA TGCTGCACGA GGCTGGTCAT GAGTTGATTT TGGTGTCTTC 4860

AGGTGCCATT GCGGCTGGTT TTGGAGCCTT AGGATTTAAA AAGCGTCCGA CTAAGATTGC 4920

TGATAAACAG GCTTCAGCAG CGGTAGGGCA GGGGCTTTTG TTGGAAGAAT ATACAACCAA 4980

TCTTCTCTTG CGTCAAATCG TTTCPGCACA AATCTTGCTG ACCCAAGATG ACTTTGTGGA 5040

TAAGCGTCGT TATAAAAATG CCCATCAGGC TTTGTCGGTT TTGCTCAACC GTGGGGCAAT 5100

TCCTATCATC AATGAGAATG ATAGTGTCGT TATTGATGAG CTCAAGGTTG GGGACAATGA 5160

CACTCTAAGT GCTCAAGTAG CGGCGATGGT CCAAGCAGAC CTTTTAGTTT TCTTGACAGA 5220

TGTGGACGGT CTCTATACTG GAAATCCTAA TTCAGATCCA AGAGCCAAAC GCTTGGAGAG 5280

AATCGAGACC ATCAATCGTG AGATTATTGA TATGGCTGGT GGAGCTGGTT CGTCAAACGG 5340

AACTGGGGGT ATGTTAACCA AAATCAAGGC TGCAACTATC GCGACGGAAT CAGGAGTTCC 5400

TGTTTATATC TGCTCATCCT TGAAATCAGA TTCCATGATT GAGGCGGCAG AGGAGACCGA 5460

GGATGGTTCT TACTTTGTTG CTCAAGAGAA GGGGCTTCGT ACCCAGAAAC AATGGCTTGC 5520

CTTCTATGCT CAGAGTCAAG GTTCTATTTG GGTTGATAAA GGGGCTGCGG AAGCTCTCTC 5580

TCAATATGGA AAGAGTCTTC TCTTATCTGG TATCGTTGAA GCAGAAGGAG TCTTTTCTTA 5640

CGGTGATATC GTGACAGTAT TTGACAAGGA AAGTGGAAAA TCACTTGGAA AAGGACGCGT 5700
GCAATTTGGA GCATCTGCTT TGGAGGATAT GTTGCGTTCT CAAAAAGCCA AGGGTGTCTT 5760

GATTTACCGT GACGACTGGA TTTCCATTAC TCCTGAAATC CAACTACTTT TTACAGAATT 5820

TTAGAGGTAA ACTATGGTGA GTAGACAAGA ACAATTTGAA CAGGTACAGG CTGTTAAAAA 5880

ATCGATTAAC ACAGCTAGTG AAGAAGTGAA AAACCAAGCC TTGCTAGCCA TGGCTGATCA 5940

CTTAGTGGCT GCTACTGAGG AAATTTTAGC GGCTAATGCC CTCGATATGG CAGCGGCTAA 6000

GGGGAAAATC TCAGATGTGA TGTTGGATCG TCTTTATTTG GATGCAGATC GTATAGAAGC 6060

GATGGCAAGA GGAATTCGTG AAGTGGTTGC CTTACCAGAT CCAATCGGTG AAGTTTTAGA 6120

AACAAGTCAG CTTGAAAATG GTTTGGTTAT CACAAAAAAA CGTGTAGCTA TGGGTGTCAT 6180

CGGTATTATC TATGAAAGCC GTCCAAATGT GACGTCTGAT GCGGCTGCTT TGACTCTTAA 6240

GAGTGGAAAT GCGGTTGTTC TTCGTAGTGG TAAGGATGCC TATCAAACAA CCCATGCCAT 6300

TGTCACAGCC TTGAAGAAGG GCTTGGAGAC GACTACTATT CATCCAAATG TGATTCAACT 6360

GGTGGAGGAT ACTAGCCGTG AAAGTAGTTA TGCTATGATG AAGGCCAAGG GCTATCTAGA 6420

CCTTCTCATT CCTCGTGGAG GAGCTGGCTT GATCAATGCA GTGGTTGAGA ATGCGATTGT 6480

ACCTGTTATC GAGACAGGGA CTGGGATTGT CCATGTCTAT GTGGATAAGG ATGCAGACGA 6540

AGACAAGGCG CTGTCTATCA TCAACAATGC TAAAACCAGT CGTCCTTCTG TTTGTAATGC 6600

CATGGAGGTT CTGCTGGTTC ATGAAAACAA GGCAGCAAGC TTCCTTCCTC GCTTGGAGCA 6660

AGTGTTGGTT GCAGAGCGTA AGGAAGCTGG ACTGGAACCA ATTCAATTCC GCCTAGATAG 6720

CAAAGCAAGC CAGTTTGTTT CAGGTCAAGC AGCTGAGACC CAAGACTTTG ACACCGAGTT 6780

TTTAGACTAT GTCCTTGCTG TTAAGGTTGT GAGCAGTTTA GAAGAAGCGG TTGCGCACAT 6840

TGAATCCCAC AGCACCCATC ATTCGGATGC TATTGTGACG GAAAATGCTG AAGCTGCAGC 6900

ATACTTTACA GATCAAGTGG ACTCTGCAGC GGTGTATGTT AATGCCTCAA CTCGTTTCAC 6960

AGATGGAGGA CAATTTGGTC TTGGTTGTGA AATGGGGATT TCTACTCAGA AATTGCACGC 7020

GCGTGGTCCC ATGGGCTTGA AAGAGTTGAC CAGCTACAAG TATGTGGTTG CCGGTGATGG 7080

GCAGATAAGG GAGTAAGAGA TGAAGATTGG ATTTATCGGT TTGGGGAATA TGGGTGCTAG 7140

CTTGGCAAAA TCTGTCTTGC AGACTAGGAC GTCAGATGAG ATTCTCCTTG CCAATCGTAG 7200

TCAAGCTAAG GTAGATGCTT TCATTGCAGA CTTTGGTGGT CAGGCTTCCA GCAATGAAGA 7260

AATGTTTGCA GAAGCAGATG TGATTTTTCT AGGAGTTAAG CCTGCTCAGT TTTCTGAACT 7320

GCTTTCTCAA TACCAGACCA TCCTTGAAAA AAGAGAAAGT CTTCTTTTGA TTTCGATGGC 7380

AGCTGGATTG ACCTTAGAAA AACTAGCAAG TCTTATCCCA AGTCAACACC GAATTATTCG 7440
TATGATGCCT AATACCCCTG CTTCTATCGG GCAAGGAGTG ATTAGTTATG CCTTGTCTCC 7500

TAATTGCAGG GCTGAGGACA GTGAGCTCTT TTATCAGCTT TTAGCCAAGG CTGGTCTCTT 7560

GGTTGAACTA GGAGAAAGTT TAATCGATGC AGCGACAGGT CTTGCAGGTT GTGGACCAGC 7620

CTTTGTCTAT CTTTTTATCG AGGCCTTGGC AGATGCAGGT GTTCAGACAG GATTACCACG 7680

AGAAATAGCA TTGAAAATGG CAGCACAAAC TGTGGTAGGA GCTGGGCAAT TGGTCCTTGA 7740

AAGTCAGCAA CATCCTGGAG TATTGAAAGA CCAAGTCTGT AGCCCAGGCG GTTCGACTAT 7800

CGCTGGTGTA GCAAGCCTAG AAGCGCATGC TTTCCGAGGA ACAGTCATGG ATGCAGTTCA 7860

TCAAGCCTAC AAACGAACAC AAGAACTAGG TAAATAAGAG GTAGTTTTGA CTGCCTCTTT 7920

TATGGTGGCT GAAATGAGAA GACACAAAAA GATTGTCACA AACCCCTATT TTTTTGATAG 7980

AATAGAAGTA GTAAAAAAGA AATGAGTTAG ACATGTCAAA AGGATTTTTA GTCTCTCTTG 8040

AGGGACCAGA GGGAGCAGGC AAGACCAGTG TTTTAGAGGC TCTGCTACCA ATTTTAGAGG 8100

AAAAAGGAGT AGAGGTGTTG ACGACCCGTG AACCTGGCGG AGTCTTGATT GGGGAGAAGA 8160

TTCGGGAAGT GATTTTGGAT CCAAGTCATA CTCAGATGGA TGCTAAAACA GAGCTACTTC 8220

TCTATATTGC CAGTCGCAGA CAGCATTTGG TGGAAAAAGT TCTTCCAGCC CTTGAAGCTG 8280

GCAAGTTGGT CATCATGGAT CGTTTTATCG ATAGTTCTGT TGCCTATCAG GGATTTGGTC 8340

GTGGCTTAGA TATTGAAGCC ATTGACTGGC TCAATCAGTT TGCGACAGAT GGCCTCAAAC 8400

CCGATTTGAC ACTCTATTTT GACATCGAGG TGGAAGAAGG GCTGGCTCGT ATTGCTGCTA 8460

ATAGTGACCG CGAGGTTAAT CGTTTGGATT TGGAAGGGTT GGACTTGCAT AAAAAAGTTC 8520

GTCAAGGCTA CCTTTCTCTT CTGGATAAAG AGGGAAATCG CATTGTCAAG ATTGATGCTA 8580

GTCTCCCTTT GGAGCAAGTT GTGGAAACTA CCAAGGCTGT CTTGTTTGAC GGAATGGGCT 8640

TGGCCAAATG AAACAAGATC AACTAAAGGC TTGGCAACCA GCTCAGTTTG ACCGTTTTGT 8700

CCGTATCTTA GAACAAGACC AGCTCAATCA CGCCTATCTC TTTTCAGGTT TCTTTGAAAG 8760

CTTGGAAATG GCGCAATTTT TAGCTAAGAG CCTCTTTTGT ACGGATAAAG TTGGCGTCTT 8820

ACCATGTGAG AAATGCCGAA GTTGCAAGCT GATTGAACAG GGAGAATTTC CCGATGTCAC 8880

CTTGATTAAA CCAGTTAATC AGGTCATTAA GACGGAACGC ATTCGAGAAT TGGTGGGTCA 8940

GTTTTCTCAA GCAGGGATTG AAAGCCAGCA ACAGGTCTTT ATCATCGAGC AAGCGGATAA 9000

AATGCATCCC AACGCAGCCA ATTCTCTGCT CAAGGTCATC GAAGAACCCC AGAGTGAAGT 9060

TTATATTTTC TTCTTGACTA GCGATGAGGA AAAGATGTTA CCGACAATCC GAAGTCGGAC 9120

TCAGATCTTC CACTTTAAAA AGCAAGAAGA AAAACTTATC TTACTCTTAG AACAAATGGG 9180

ACTTGTTAAG AAAAAAGCGA CTCTTTTAGC TAAGTTTAGT CAATCGCGAG CTGAAGCAGA 9240
AAAGTTGGCT AATCAGGCAA GTTTTTGGAC CTTGGTCGAT GAAAGTGAAC GCCTGCTGAC 9300

TTGGTTAGTA GCTAAGAAAA AAGAAAGTTA TCTACAGGTT GCCAAATTAG CCAACTTGGC 9360 

AGATGATAAG GAAAAACAGG ATCAGGTTTT ACGGATTCTT GAAGTTCTCT GTGGGCAGGA 9420 

CCTCTTGCAG GTAAGAGTAA GAGTGATTCT ACAAGATTTA CTAGAAGCTA GAAAAATGTG 9480 

GCAAGCTAAT GTCAGCTTTC AAAATGCCAT GGAATATCTG GTCTTGAAAG AAATATAAAC 9540 

TCAAAAATGA ATGATAAAGA AAGGAAAGGG CTGTTTTATG GACAAAAAAG AATTATTTGA 9600 

CGCGCTGGAT GATTTTTCCC AACAATTATT GGTAACCTTA GCCGATGTGG AAGCCATCAA 9660 

GAAAAATCTC AAGAGCCTGG TAGAGGAAAA TACAGCTCTT CGCTTGGAAA ATAGTAAGTT 9720 

GCGAGAACGC TTGGGTGAGG TGGAAGCAGA TGCTCCTGTC AAGGCCAAGC ATGTTCGTGA 9780 

AAGTGTCCGT CGCATTTACC GTGATGGATT TCACGTATGT AATGATTTTT ATGGACAACG 9840 

TCGAGAGCAG GACGAGGAAT GTATGTTTTG TGACGAGTTG CTATACAGGG AGTAGGCATG 9900 

CAGATTCAAA AAAGTTTTAA GGGGCAGTCT CCCTATGGCA AGCTGTATCT AGTGGCAACG 9960 

CCGATTGGCA ATCTAGATGA TATGACTTTT CGTGCTATCC AGACCTTGAA AGAAGTGGAC 10020 

TGGATTGCTG CTGAGGATAC GCGCAATACA GGGCTTTTGC TCAAGCATTT TGACATTTCC 10080 

ACCAAGCAGA TCAGTTTTCA TGAGCACAAT GCCAAGGAAA AAATTCCTGA TTTGATTGGT 10140 

TTCTTGAAAG CAGGGCAAAG TATTGCTCAG GTCTCTGATG CCGGTTTGCC TAGCATTTCA 10200 

GACCCTGGTC ATGATTTAGT TAAGGCAGCT ATTGAGGAAG AAATTGCAGT TGTGACAGTT 10260 

CCAGGTGCCT CTGCAGGAAT TTCTGCCTTG ATTGCCAGTG GTTTAGCGCC ACAGCCACAT 10320 

ATCTTTTACG GTTTTTTACC GAGAAAATCA GGTCAGCAGA AGCAATTTTT TGGCTTGAAA 10380 

AAAGATTATC CTGAAACACA GATTTTTTAT GAATCACCTC ATCGTGTAGC AGACACGTTG 10440 

GAAAATATGT TAGAAGTCTA CGGTGACCGC TCCGTTGTCT TGGTCAGGGA ATTGACCAAA 10500 

ATCTATGAAG AATACCAACG AGGTACTATC TCTGAGTTAT TAGAAAGCAT TGCTGAAACG 10560 

CCACTCAAGG GCGAATGTCT TCTCATTGTT GAGGGTGCCA GTCAGGGTGT GGAGGAAAAG 10620 

GACGAGGAAG ACTTGTTCGT AGAAATTCAA ACCCGCATCC AGCAAGGTGT GAAGAAAAAC 10680 

CAAGCTATCA AGGAAGTCGC TAAGATTTAC CAGTGGAATA AAAGTCAGCT CTACGCTGCC 10740 

TACCACGACT GGGAAGAAAA ACAATAAAGG GAGACAGGAT GTAATAATTC TGTCTGTTTC 10800

TGTTTAACTT AATTAGTGAT GATAATATAA AGATGTATCA CTTGGTATAG AAGCTTTGGT 10860 

ATTAAGTTTT TTATTAAGCC CATACGGAAT ACCGATGGTT GGAGCAGCAG TTATAGCGTT 10920 

CTTAGAAGGT ATAAATAGAA AAATAAGGTC ATTTTAAATC AAAGGATTGA TAAATCAGAA 10980 
AGAAGGTGAT TTTTTGCGAA CATACGAAAA TAAAGAAGAA CTAAAAGCTG AGATAGAGAA 11040

AACATTTGAG AAATATATTT TAGAATTTGA TAATATTCCA GAAAATTTAA AAGATAAGAG 11100 

AGCTGATGAA GTTGACAGAA CTCCAGCAGA AAACCTTGCT TATCAGGTTG GTTGGACCAA 11160 

CTTGGTTCTT AAATGGGAAG AAGATGAAAG AAAGGGGCTT CAAGTAAAAA CACCATCGGA 11220 

TAAATTTAAA TGGAATCAAC TTGGTGAATT ATATCAGTGG TTCACAGATA CCTACGCTCA 11280 

TTTATCTCTG CAAGAGTTGA AAGCAAAATT AAATGAAAAT ATTAATTCTA TCTCTGCAAT 11340

GATTGATTCG TTGAGTGAGG AAGAATTATT TGAACCGCAT ATGAGAAAGT GGGCTGATGA 11400 

AGCGACTAAA ACAGCGACTT GGGAAGTGTA TAAGTTTATT CATGTAAATA CGGTTGCACC 11460 

TTTTGGAACT TTCAGAACTA AAATCAGAAA ATGGAAGAAG ATAGTATTAT AAATTATATT 11520 

TTTAACTTTA AAAAATTTCA TAAAAATGGT TACCAAAGGC GATAGAAGAA AAACTATCGT 11580 

CTTTTTCTTT GCAAATTTTT AAGAAGGGAG GTGATCTTGC ATGGACTTTG AATATTTTTA 11640 

TAACAGAGAA GCGGAAAGAT TTAACTTCTT AAAAGTACCG GAGATATTAG TTGATAGAGA 11700 

AGAATTTCGG GGCTTATCAG CAGAAGCAAT TATCCTTTAT TCCATACTTC TTAAACAGAC 11760 

AGGAATGTCA TTTAAGAATA ACTGGATAGA CAAGGAAGGC AGAGTATTTA TCTATTTTAC 11820 

TGTCGAAGAA ATTATGAAAA GAAGAAATAT CTCAAAGCCA ACTGCCATAA AAACATTAGA 11880 

TGAGCTTGAT GTAAAAAAGG AATAGGACTG ATCGAAAGAG TAAGGCTTGG ACTTGGTAAG 11940 

CCGAACATCA TTTATGTTAA AGACTTTATG AGTATATTTC AGGTAAAAGA AAATGACTTA 12000 

CAGAAGTCAA AAAACTTAAC TTCAGAAGTA AAAGATTTTA ACCTCAGAAG TAAAGAAAAT 12060 

GAACTTCAAG AGGTTAAGAA CCTTGACTCT AACTATATAG AGAATAATAA GAGTAAGTAT 12120 

AGTAAGAGAG AATATAGTTT TGGTGAAAAC GGACTTGGAA CATTTCAAAA TGTGTTTTTA 12180

GCTGCTGAAG ATATATCGGA TTTACAAATC ATAATGAACT CACAGCTTGA GAATTACATT 12240 

AGACTTCCTG CAAAACTAGA ATCCTAGTTC ATGATTGATA ATGCCAGCAA TCAAATTCAT 12300 

TCGTAATCCG AAGCGTTTAC GATGATTTCG ATAGATTGTT GAAAACATTT TAAACGTTTT 12360 

TACTTTGGCA AAGATGTTCT CAATCTTGCT TCTCTCCTTG GATAGCGCAT GGTTACAGGC 12420 

TTTATCTTCA GCTGTTAGCG GCTTGAGTTT GCTGGATTTA CGTGGAGTTT GTACTTGAGG 12480 

ATATATCTTC ATGAGCCCTT GATAACCACT GTCAGACAAG ATTTTACCAG CTTGTCCGAT 12540 

ATTTCTGCGA CTCATTTTGA ACAACTTCAT ATCACGACAA TAGTTCACAG CGATATCCAA 12600 

AGAAACAATT CTCCCTTGAC TTGTGACAAT CGCTTGAGCC TTCATAGCGT GAAATTTCTT 12660 

TTTACCAGAA TGATTCGCTA ATTCTTTTTT TAGGGCGATT GATTTTTACT TCCGTCGCAT 12720 

CAATCATTAC CGTGTCCTCA GAACTGAGAG GAGTTCTTGA AATCGTAACA CCACTTTGAA 12780 
CAAGAGTTAC TTCAACCCAT TGGCTCCGAC GGATTAAGTT GCTTTCGTGA ATACCAAAAT 12840

CAGCCGCAAT TTGTTCATAA GTTCGATATT CTCGCACATA TTGAAGAGTG GCCATAAGAA 12900

GGTCTTCTAG GCTTAATTTA GGTTTTCGTC CACCTTTTGC GTGTTTAAGT TGATAAGCTG 12960

TTTTTAATAC AGCTAATATC TCTTCAAAAG TCGTGCGCTG AACACCAACA AGACGCTTAA 13020

ATCGTGCATC AGTTAGTTGT TTACTTGCTT CATCATTCAT AGAACTACTA TACCATATTT 13080

TGTTTCGCAG GAAGTCTATT GGAAAGTAAG AAATATTGAA GCTGAGGCTA TTAGAAGAAA 13140

TTGTGAGCGT GGTGCTATTT TTTCAGGTAA AATAAAATAT CACGAAGATT CACAGTTTAA 13200

AGGAGATCAC TATGTTGAAT GTTATGCTGT TTTAGATAAT ACGGTTATAG CAAGAGATAG 13260

AATAACAGTC CCTATCGATC CGTTATGTGG AAAAGATTTT ATAGAGTAGC ATATAATTGA 13320

TTCTTAACTG GAATACTCAC TATCTCTTTA CATCAAGAAA ATGACTAAAC AGGGAAGTTT 13380

GCCTTCTTCC CTTTTTTTGT TATACTAGTA uMuAAAAAA TTAGAAAGAT TTGTGGGTGT 13440

CAAACAGCCC AGTGGGGTGT TTTAATATGG ACTTAGGTCC CACCCAAAGA GGTATTAGTG 13500

TCGTGTCTCA ATCTTATATC AATGTTATCG GTGCTGGTTT GGCAGGTTCT GAAGCAGCTT 13560

ACCAAATCGC AGAGCGTGGT ATTCCAGTTA AACTATATGA AATGCGTGGT GTCAAGTCTA 13620

CACCCCAGCA TAAAACAGAC AATTTTGCTG AGTTGGTTTG TTCCAATTCT TTGCGTGGGG 13680

ATGCTTTGAC AAATGCAGTT GGTCTTCTCA AGGAAGAAAT GCGTCGCTTG GGTTCTGTTA 13740

TCTTGGAATC TGCTGAGGCT ACACGTGTTC CTGCAGGTGG TGCCCTTGCA GTGGACCGTG 13800

ATGGTTTCTC TCAAATGGTG ACCGAAAAAG TTGCCAACCA CCCCTTGATT GAAGTGGTTC 13860

GTGATGAAAT TACAGAATTG CCGACAGATG TTATTACGGT TATCGCTACT GGTCCTTTGA 13920

CAAGTGATGC CTTGGCTGAA AAGATTCATG CTCTTAATGA CGGTGCTGGT TTTTATTTCT 13980

ACGATGCGGC AGCGCCTATT ATCGATGTCA ACACTATCGA TATGAGCAAG GTCTACCTCA 14040

AATCACGTTA TGATAAGGGA GAAGCGGCCT ACCTCAATGC CCCTATGACC AAGCAAGAVT 14100

TTATGGATTT CCATGAAGCT TTGGTCAATG CAGAAGAAGC ACCGCTTAGT TCTTTTGAAA 14160

AAGAAAAGTA CTTTGAAGGA TGTATGCCTA TCGAAGTCAT GGCCAAACGT GGCATTAAAA 14220

CTATGCTTTA TGGCCCTATG AAGCCAGTCG GTCTTGAGTA CCCAGACGAC TATACAGGAC 14280

CTCGTGATGG AGAATTTAAA ACACCTTATG CGGTTGTGCA ACTTCGTCAG GATAATGCAG 14340

CTGGTAGCCT CTACAATATT GTTGGTTTCC AGACCCACCT CAAATGGGGA GAACAAAAGC 14400

GTGTCTTCCA AATGATTCCG GGTCTTGAAA ATGCGGAGTT TGTCCGTTAT GGTGTGATGC 14460

ATCGCAATTC TTACATGGAT TCACCAAATC TTCTTGAGCA GACTTACCGT TCTAAGAAAC 14520
AACCAAATCT CTTCTTTGCT GGTCAAATGA CGGGTGTGGA AGGCTATGTT GAGTCGGCGG 14580

CTTCAGGCTT AGTTGCGGGA ATTAACGCAG CTCGTCTCTT CAAGGAAGAA AGCGAGGCTA 14640

TTTTCCCCGA GACGACAGCG ATTGGAAGCT TAGCTCATTA CATTACCCAT GCCGACAGCA 14700

AACATTTCCA ACCAATGAAT GTCAATTTTG GGATCATCAA GGAGTTGGAA GGCGAGCGTA 14760

TCCGTGATAA GAAGGCTCGT TATGAAAAAA TTGCAGAGCG TGCCCTTGCC GACTTAGAGG 14820

AATTTTTGAC TGTCTAATTT TTTTGAAAGA ATTGCTCATG ATACTATAAA AATCTTAGAA 14880

ATTGTGATAA AATAGGTAGG ATGAAAGAAG GAGAGTGAAA ATGGCGAATC CCAAGTATAA 14940

ACGTATTTTA ATCAAGTTAT CAGGTGAAGC CCTTGCCGGT GAACGTGGCG TAGGGATTGA 15000

TATCCAAACA GTTCAAACAA TCGCAAAAGA GATTCAAGAA GTTCATAGCT TAGGTATCGA 15060

AATTGCCCTT GTTATCGGTG GAGGAAATCT CTGGCGTGGA GAACCTGCAG CAGAAGCAGG 15120

TATGGACCGT GTTCAGGCAG ATTACACAGG AATGCTTGGG ACTGTTATGA ATGCTCTTGT 15180

GATGGCAGAT TCATTGCAAC AAGTTGGGGT TGATACGCGT GTACAAACAG CTATTGCCAT 15240

GCAACAAGTG GCAGAGCCTT ATGTCCGTGG ACGTGCCCTT CGTCACCTTG AAAAAGGCCG 15300

TATCGTTATC TTTGGTGCTG GAATTGGTTC ACCTTACTTC TCGACAGATA CAACAGCGGC 15360

CCTTCGTGCA GCTGAAATCG AAGCAGATGC CATCCTCATG GCTAAAAATG GTGTCGATGG 15420

TGTTTACAAT GCCGATCCTA AGAAAGATAA GACAGCTGTT AAGTTTGAAG AATTGACCCA 15480

CCGTGACGTT ATCAATAAAG GTCTTCGTAT CATGGACTCA ACAGCTTCAA CCCTCTCAAT 15540

GGACAACGAC ATTGACTTGG TTGTATTCAA CATGAACCAA CCAGGCAACA TCAAACGTGT 15600

CGTAT1TGGT GAAAATATCG GAACAACAGT TTCAAATAAT ATCGAAGAAA AGGAATAAGA 15660

AAGAATATGG CTAACGCAAT TATTGAAAAA GCTAAAGAGA GAATGACCCA GTCTCACCAA 15720

TCACTTGCTC GTGAATTTGG TGGTATCCGT GCTGGTCGTG CCAATGCAAG CTTGCTTGAC 15780

CGTGTACATG TAGAATACTA TGGAGTCGAA ACTCCTCTTA ACCAAATCGC TTCAATTACG 15840

ATTCCAGAAG CGCGTGTTTT GTTGGTAACA CCATTTGACA AGTCTTCATT GAAAGACATC 15900

GAACGTGCCT TGAACGCTTC TGATATTGGT ATCACACCGG CTAATGACGG TTCPGTGATT 15960

CGCTTGGTTA TCCCAGCTCT TACAGAAGAA ACTCGTCGTG ACCTTGCTAA AGAAGTGAAG 16020

AAGGTCGGCG AAAATGCTAA AGTGGCTGTC CGCAATATCC GTCGCGATGC TATGGACGAA 16080

GCTAAGAAAC GAGAAAAAGC AAAAGAAATC ACTGAAGACG AATTGAAGAC TCTTGAAAAA 16140

GACATTCAAA AAGTAACAGA CGATGCTGTT AAACACATCG ACGACATGAC TGCTAACAAA 16200

GAGAAAGAAC TTTTGGAAGT CTAAAAATAA ACAGAAAAAC TCAGTTGGCA TTGCTGGCTG 16260

AGTTTTATTC GAAAGAAGGA AATATGAATA CAAATCTTGC AAGTTTTATC GTTGGACTGA 16320
TCATCGATGA AAACGACCGT TTTFACTTTG TGCAAAAGGA TGGTCAAACC TATGCTCTTG 16380

CTAAGGAAGA AGGCCAACAT ACAGTAGGGG ATACGGTCAA AGGTTTTGCA TACACGGATA 16440

TGAAGCAAAA ACTCCGCCTG [illegible] AAGTGACTGC CACTCAGGAC CAATTTGGTT 16500

GGGGACGTGT CACAGAGGTT r>prp» ftpp A/TP TGGGTGTCTT TGTGGATACA GGCCTTCCTG 16560

ACAAGGAAAT CGTTGTGTCA f**Pf P IT a rprp/-i TCCCTGAGCT CAAGGAACTC TGGCCTAAGA 16620

AGGGCGACCA ACTCTACATC CGTCTTGAAG TGGATAAGAA AGACCGTATC TGGGGCCTCT 16680

TGGCTTATCA AGAAGACTTC CAACGTC1 1\S CTCGTCCTGC CTACAACAAC ATGCAGAACC 16740

AAAACTGGCC AGCCATTGTT X ALLu ill CA AGCTGTCAGG AACTTTTGTT TACCTACCAG 16800

AAAATAATAT GCTTGGTTTT ATTCATCCTA GCGAGCGTTA CGCAGAGCCA CGTTTGGGGC 16860

AAGTATTAGA TGCGCGCGTT ATTGGTTTCC GTGAAGTGGA CCGCACTCTG AACCTCTCCC 16920

TCAAACCACG CTCCTTTGAA & rn/^rpm^/^ Ti J\ TV ACGATGCTCA GATGATTTTG ACTTATTTGG 16980

AAAGCAATGG CGGTTTCATG ACC 1 i AAA 1 (j ACAAGTCATC TCCAGACGAC ATCAAGGCAA 17040

CCTTTGGCAT TTCTAAAGGT C Av» X i\. AAtiA AAGCTTTAGG TGGTCTTATG AAGGCTGGTA 17100

AAATCAAGCA GGACCAGTTT GGGACAGAGT TGATTTAGGG AGGCTTATGA GAAAATCATT 17160

TTACACTTGG CTCATGACCG AGCGCAATCC TAAAAGTAAC AGTCCCAAAG CAATTTTGGC 17220

AGACCTCGCT TTTGAAGAGT CAGCCTTTCC AAAACACACA GATGATTTTG ATGAGGTCAG 17280

TCGCTTTTTG GAGGAGCATG CCAGTTTCTC TTTTAACCTA GGAGATTTTG ACAGCATTTG 17340

GCAGGAATAT CTAGAACACT AGCATTTATT CATTGGGTTT GGGCTAGTAA TTTCTCCATC 17400

CCTCTGCTAT AATAAAAAGA AATAAAAGGA TTAGAGAGGT TCTTTATTTG AAGGAACATT 17460

CAATAGACAT TCAACTGAGT CATCCAGATG ACCTGTTTCA TCTTTTTGGT TCCAATGAAC 17520

GCCATCTTCG TTTGATGGAA GAAGAGCTTG ATGTTGTGAT TCATGCTCGT ACGGAGATTG 17580

TCCAGGTTTT GGGAGAAGAG TCTGCCTGTG AGGAAGCCCG TCAAGTTATT CAGGCTTTGA 17640

TGGTCTTGGT AAATCGTGGG ATGACCGTTG GTACGCCAGA TGTAGTCACT GCGATTAGCA 17700

TGGTCAAAAA TGATGAAATT GACAAGTTTG TCGCCCTTTA CGAAGAAGAA ATTATCAAGG 17760

ATAATACTGG GAAACCTATC CGTGTCAAAA CCCTAGGGCA AAAGCTTTAT GTGGACAGTG 17820

TCAAACAGCA TGATGTGACC TTTGGAATTG GGCCAGCAGG TACAGGGAAG ACCTTCCTTG 17880

CAGTGACCTT GGCAGTGACT GCCCTTAAAC GTGGGCAAGT CAAGCGAATT ATCCTAACTC 17940

GTCCAGCGGT GGAAGCGGGA GAGAGTCTTG GATTTCTTCC GGGTGATCTT AAGGAGAAGG 18000

TGGATCCTTA CCTTCGTCCT GTTTACGATG CCTTGTATCA AATTCTTGGG AAAGACCAAA 18060
CGACTCGTCT CATGGAGCGT GAAATTATCG AAATTGCGCC CCTTGCCTAT ATGCGTGGCC 18120

GGACCTTGGA TGATGCCTTT GTCATTCTCG ATGAGGCGCA AAACACGACC ATCATGCAGA 18180

TGAAGATGTT CTTGACGCGT TTAGGTTTTC ATTCTAAGAT GATTGTCAAT GGAGATATTA 18240

GTCAGATTGA CCTGCCACGT AATGTCAAGT CCGGTTTGAT TGATGCTCAA GAGAAACTCA 18300

AGAACATCCA TCAGATTGAC TTTGTTCATT TTTCAGCCAA GGATGTGGTT CGCCATCCTG 18360

TTGTCGCTCA GATTATCCGA GCCTATGAAT ATTCTACTGA AGTTGCACAC GACTGATTTT 18420

GAGGAAGTTC GCCTGCAAAA GAATAGACTT GTTCGGTAAC TGTAAAAAGT GTTATACTAT 18480

TTTTATGGAA ACAGTATACG ACAAAGCACA AAAACTTAAC TCAAAAAACT TCAAACTATT 18540

GATTGGTGTC AAAAAGGAAA CCTTTCAACT CATGCTAGAA CACCTGAATT CAGCCTATCA 18600

GATTCAGCAC CGAAAAGGTG GACGTCCACG TAGTCTGCCC ATGGAAGACC AGCTCATTAT 18660

GACCCTCCGT TACTTGCGAT ATTATCCCAC TCAGCGTCTG CTGGCCTTTG ATTTTGGCGT 18720

CGGTGTAGCT ACGGTAAATG CCATCATCAC TTGGGTGGAG GATACACTTC GTGCGTCAGG 18780

TAGCTTTGAT TTGGACCATT TAGAAGCCCC GAGTGCTGCT GTGGCTATTG ACGTGACCGA 18840

AAGTCCGATT CAGCGTCCAA ACAAAACCAA AGCAAAAATT ATTCTGGTAA AAAGAAACGA 18900

CACACCTTAA AAACTCAAAT TATGCTGGAT TTGACGACAC ATAAAGTCTG TCAAATGGCC 18960

TTTTCTGACG GACATACGCA TGATTTTACT CTCTTCAAAG AAAGTATTGG ACAAAGTTTG 19020

CCTGAAACGA CGCTTGCCTT TGTTGACCTA GGTTATTTAG GCATCTTGAA ATTTCATGAG 19080

AATACTTTCA TTCCTGCTAA AAATTCCAAA AATCGCCGCC TGAGTGAGGA TGATAAGCAG 19140

TTAAATAAAG AGATGTCAGC GATACGAATT GAAATTGAAC ATTTTAACGC TAAATTCAAG 19200

ACCTTCCAAA TCATGTCAGT CCCTTATCGT AACCGCAGAA AACGTTTCGA GTTACGGGCG 19260

GAATTAATTT GTGCCATCAT CAATTATGAA GTGAACTAGA TTCCGAACAA GTCTAATATA 19320

CTTTTGAGAG AGGAAAATCC AGTTGTATAG GCTAAAGGTT TTATCCAAAG GTCTGAGAOA 19380

ACGATTAGGC ACGATGGAAA GAACTTTTAT GTGGCTGATG ACGATCAGTG CATCTTCCTG 19440

TGTCATAATC ACAGGGCACA AGAAAGTAGG AATTTGAAAA GATGATTGAC CAACTATCTA 19500

AGTATTACAG TTGTAGGATA CTAACTGAAA AGGATATTCC AAGTATTTTA TCTTTATATG 19560

AAAGTAATCC TCTGTATTTT CAGCATTGTC CACCAGAGCC AAATTTTGCA ACTGTAAAAG 19620

AGGACATGCT TTGTCTACCT GAAGGTAAAG CTAAGGCTGA TAAGTTTTTT GTTGGATTTT 19680

GGAATGGATC TGACCTTGTG GCTGTTATGG ATTTTGTCTA TGCATATCCT GATGAGGAGA 19740

CTGTTTTTAT TGGTTTGTTT ATGGTTGATC AAGCCTATCA GAGAAAAGGG ATTGGTAGTC 19800

ATATTGTGAC AGAAGCACTA GCTTATTTTG CTAAGAACTT TCGAAAGGCA CGTTTGGCTT 19860
ATGTTAAGGG AAATCCGCAA TCTCAGCATT TTTGGGAAAA GCAGGGCTTT AAATCAATTG 19920

GATGCGAGGT TAAGCAAGAA CTCTATACGG TTGTTATCGC TGAACAGAGC CTAGAAGATT 19980 

AGAAATGGCA TCAAGTAAGA ACTATTTGGA ATTTGTTTTG GAACAATTAT CAGGATTAGA 20040 

TGATGTGACT TACCGTTCCA TGATGGGGGA GTATATTCTT TACTTCCGCG GCAAGATTAT 20100 

TGGCGGCATT TATGACGATC GCTTTTTAGT TAAACCCGTG CAAGCAGTCT TAGATAAGAT 20160 

TGACCAATCT TCTTTTGAGT TTCCATACAA AGGTGCCAAA GAAATGATTT GAGTGGAAGA 20220 

ACTTGATAAT AAGATGTTTC TATAAGACCT AATTTTAGCT ATGTATAACC AACTGCCAAC 20280 

GCCCAAACCT AAAAAGAAAA AGCAAGGGTG AACGAAGTAA AAAAGAAGTC TGCTAAGGCC 20340 

CTGTCTTTGC ACGGGTAAAA TTTTATATAT AAAAAGAAGC TGGGACTAAA GAGCTCAGCT 20400 

TCCTTTGGTT TATATAATTG TCATTACAAG ACGAAGTGGT TGGGCGAAAC TCTGTTGACT 20460 

TTATTCAATT TAGAGTTTCT TATGCACAAT TGAGTCTGGA ACGAAAGTCT CCAGTTGCAA 20520 

AGTATACAGT ACAATAAACC AACGATGTAA TAGCTGATGA CACAAAGCAC AGTGGGTAGG 20580 

ACTTGCGAAG TCACCCTTTT CTTTTCAAAA TTTATACTAA ATCATTGATA TCAGTGTACT 20640 

CACGATTAAG TCCTTGAGCA ACTGGTAGGT TAGTCAAGTA ACCTTGATAA GTAGTCACAC 20700 

CTTGACGCAA GCCTTCATCT TCAGAGATTG CTTGTGCGAA TCCTTTGCCA GCCAAAGCTT 20760 

CGATATAAGG AAGAGTGACA TTGGTTAGGG CGATGGTTGA AGTGCGAGCA ACCGCACCAG 20820 

GGATATTGGC AACGGCATAG TGGAGAACAC CGTGTTTTTC ATAGACGGGT TCATCGTGCG 20880 

TTGTCACACG GTCAGCTGTT TCGATAACGC CACCTTGGTC AACAGCAACG TCAACGATAC 20940 

AGAGCCTGGA CGCATTTGTT TGACCATCTC ATCTGTCACC AATTCCGGTG CTTTTGCACC 21000 

AGGGATGAGA ATGGCTCCAA TCACCACATC AGCATCTCTC ACACTTGCTT CAATGTTGAA 21060 

TGAATTAGAC ATAAGAGTTT GAATTTGACT TCCAAAGACT TCTTCTAGAA CTGAGAGACG 21120 

CTTGGAACTA ATATCTAAAA TAGTCACTTG AGCACCAAGA CCAAGGGCGA TGCGGGCAGC 21180 

ATGTGTACCG ACGACACCAC CACCGATGAT AGTTACTTTT CCTTTTGGAA CACCTGGTAC 21240 

ACCACCAAGT AGAACACCAG AGCCACCAGC TTGCTTAGTA AGGAAGTGAG CTCCGATTTG 21300 

AACAGCCATA CGACCTGCAA CCTCACTCAT AGGAACGAGG AGCGGTAGTT GTCCTTGATT 21360 

GTCACGAACA GTTTCAGTTG TTTTTGCTGT TAACATAGCA TCTGCTAATT CTGGAGCAGC 21420

GGCCATGTGC AAGTAGGTGA AGAGAAGAAG ATCGTCGCGC AAGTAACCGT ATTCAGAACT 21480 

TAAAGATTCT TTTACTTTCA CAACCAACTC TGCTGCCCAA GCTTCACCAG CAGTAGCGAC 21540 

AATCTCAGCT CCTTGCTTTT GATAGTCAGC ATCAGTAAAG CCAGAACCGA GACCAGCATT 21600 
TGTTTCGATA AGGACACGAT GACCACGACT AACTAAGCTA TGAACACCTG CAGGTGTGAG 21660

GGCGACACGG TTTTCGTTAT TTTTAATTTC TTTTGGGATT CCGATTAACA TTGAGATAAC 21720 

CTACCTTTCA ATTGACGGTC TTGTTTTGGT TGTCACATTC CAGTTCATAA ATCAAAAATG 21780 

TGACGGTOTC ATTGTATATG AAACCGCTTC AAAAATCAAG AAAAACTTGT CATCCAAATT 21840 

TTTTTATGCT AGACTAGTGA AAATCAAGCT CTAATGGAGG GAAAAGTATG GAATCAATAT 21900 

TTGTGAAATT TGCCCAGTAT CCGTCTATAG AAACGGAGCG TTTATTGCTC AGACCTGTAA 21960 

CTTTGGATGA TGCGGAAcAA TGTTTGACTA TGCCTCGGAC AAGGGTAATA CACGTTACAC 22020 

TTTTCCAACC AATCAAAGCT TGGAAGAAAC CAAGAATAAC ATTGCTCAGT TCTACTTGGC 22080 

TAATCCCTTG GGACGTTGGG GAATAGAACT AAAAAGCAAT GGTCAGTTTA TTGGAACCAT 22140 

TGACTTGCAC AAGATTGATT CTGTTCTTAA GAAGGCAGCT ATTGGCTACA TTATCAATAA 22200 

AAAGTATTGG AATCAAGGAT TAACGACAGA AGCCAATCGT GCTGTGATTG AGCTAGCTTT 22260 

TGAGAAGATA GGGATGAATA AGTTGACTGC CCTTCACGAT AAGGCTAATC CCGCGTCAGG 22320 

AAAGGTCATG GAGAAATCAG GCATGCGTTT TTCCCATGCA GAACCATATG CTTGTATGGA 22380 

CCAGCATGAA AAAGGCCGAA TCGTGACAAG AGTTCATTAT GTCTTGACCA AGGAAGACTA 22440 

TTTTGCAAAT AAATAAGCAG TTGAAAAGAA ATTTTTCGAC TGTTTTTTCT TCCTCTTACG 22500 

AATAATCTAA GAGAGGAGAA AATATGGAAG CAATTATCGA GAAAATCAAA GAGTATAAAA 22560 

TCATCGTCAT CTGTACTGGT CTGGGCTTGC TTGTAGGAGG ATTTTTCCTG CTAAAACCAG 22620 

CTCCACAAAC ACCTGTCAAA GAGACGAATT TGCAGGCTGA AGTTGCAGCT GTTTCCAAGG 22680 

ACTCATCGAC CGAAAAGGAA GTGAAGAAGG AAGAAAAGGA AGAACCCCTT GAACAAGATC 22740 

TAATCACAGT AGATGTCAAA GGTGCTGTCA AATCGCCAGG GATTTATGAC TTGCCTGTAG 22800 

GTAGTCGAGT CAATGATGCT GTTCAGAAGG CTGGTGGCTT GACAGAGCAA GCAGACAGCA 22860 

AGTCGCTCAA TCTAGCTCAG AAAGTTAGTG ATGAGGCTCT GGTTTACGTT CCTACTAAGG 22920 

GAGAAGAAGC AGTTAGTCAA CAGACTGGTT CGGGGACAGC TTCTTCAACA AGCAAGGAAA 22980 

AGAAGGTCAA TCTCAACAAG GCCAGTCTGG AAGAACTCAA GCAGGTCAAG GGACTGGGAG 23040 

GAAAACGAGC TCAGGACATT ATTGACCATC GTGAGGCAAA TGGCAAGTTC AAGTCAGTAG 23100

ACGAGCTCAA GAAGGTCTCT GGCATTGGTG GCAAAACAAT AGAAAAGCTT AAAGACTATG 23160 

TTACAGTGGA TTAAGAATTT CTCTATTCCC CTAATTTACC TGAGTTTTCT ATTACTTTGG 23220 

CTTTATTACG CTATTTTCTC AGCATCTTAT CTTGCTTTGT TGGGCTTTGT TTTTCTGCTA 23280 

GTCTGTCTCT TTATCCAATT TCCGTGGAAA TCTGCTGGTA AAGTTCTAAT AATTTGCGGA 23340 

ATCTTTGGAT TTTGGTTTGT TTTTCAAAAT TGGCAACAGA GTCAAGCGAG TCAAAATCTG 23400 
GCGGATTCTG TTGAAAGGGT ACGGATTTTG CCTGATACTA TTAAGGTTAA TGGTGATAGT 23460

CTATCCTTTC GTGGCAAGTC TAACGGTCGT GCTTTCCAAG TCTATTATAA ACTCCAGTCC 23520

GAGGAGGAGA AAGAAGCCTT TCAAGCTTTA ACTGACCTGC ATGAGATAGG ACTAGAAGGG 23580

AAGCTTTCGG AGCCAGAAGG GCAGAGAAAT TTTGGTGGCT TTAATTACCA AGCCTATCTG 23640

AAGACTCAGG GAATTTACCA GACTCTCAAT ATCAAAACAA TCCAGTCACT TCAAAAGATT 23700

GGCAGTTGGG ATATAGGAGA AAACTTGTCC AGTTTACGTC GAAAGGCTGT GGTTTGGATT 23760

AAGACGCACT TTCCAGACCC TATGGGCAAT TACATGACAG GACTCTTGCT GGGACATCTG 23820

GACACCGACT TTGAGGAGAT GAATGAGCTT TATTCCAGTC TAGGAATTAT CCACCTCTTT 23880

GCCCTATCTG GCATGCAGGT AGGTTTTTTC ATGAATGGAT TTAAGAAACT TCTCTTGCGA 23940

TTGGGCTTGA CCCAAGAAAA GTTGAAATGG CTGACTTATC CCTTTTCCCT TATCTATGCG 24000

GGACTAACTG GATTTTCAGC ATCGGTTATT CGCAGTCTCT TGCAAAAGCT ACTGGCTCAA 24060

CATGGGGTTA AGGGCTTGGA TAATTTTGCC TTGACGGTGC TTGTCCTCTT TATTGTCATG 24120

CCAAACTTTT TCTTGACAGC AGGAGGAGTC TTGTCCTGCG CTTATGCTTT TATCCTGACC 24180

ATGACCAGCA AAGAAGGGGA GGGGCTCAAG GCTGTTACTA GTGAAAGTCT AGTCATCTCC 24240

TTGGGCATAT TGCCCATTCT ATCCTTCTAT TTTGCGGAAT TTCAACCTTG GTCTATCCTT 24300

TTGACCTTTG TCTTTTCCTT TCTTTTTGAC TTGGTCTTCT TACCGCTCTT GTCTATCTTA 24360

TTTGTCCTTT CCTTTCTCTA TCCAGTCATT CAGCTGAACT TTATCTTTGA ATGGTTAGAG 24420

GGCATTATTC GCTTGGTCTC GCAGGTGGCA AGGAGACCAC TTGTCTTTGG TCAACCCAAC 24480

GCATGGCTTT TAATCTTATT GTTAATTTCC TTGGCTTTGG TCTATGATTT GAGGAAAAAC 24540

ATTAAAGGAT TAACAGTATT GAGTTTATTG ATTACAGGTC TCTTTTTCCT TACCAAGTAT 24600

CCACTGGAAA ATGAAATCAC CATGCTGGAT GTGGGGCAAG GAGAAAGTAT TTTCTACGGG 24660

ATGTAACTGG GAAAACCATT CTCATAGATG TAGGTGGTAA GGCAGAATCT TATAAGAAAA 24720

TCAAAAAATG GCAAGAAAAG ATGACGACCA GCAATGCCCA GCGAACCTTG ATTCCCTATC 24780

TCAAAAGTCG AGGAGTAGCT AAGATTGACC AGCTAATTTT GACTAACACG GACAAGGAGC 24840

ATGTTGGAGA TTTGTCAGAG ATGACCAAGG CTTTCCATGT AGGGGAGATT CTAGTATCAA 24900

AAGACAGTCT GAAACAGAAG GAATTTGTGG CAGAACTACA GGCGACTCAA ACAAAGGTGC 24960

GTAGTATGAT AGTAGGGGAG AACTTGCCCA TTTTTGGAAG TCAGTTAGAA GTTCTATCTC 25020

CAAGGAAAAT GGGAGATGGA GGACACGATG ATACCCTAGT TCTGTATGGG AAATTCTTGG 25080

ATAAGCAATT TCTCTTCACG GGAAATTTGG AGGAGAAAGG AGAGAAGGAC TTGCTGAAGC 25140
ACTATCCAGA CTTGAAAGTA AATGTTTTGA AAGCTAGCCA ACATGGCAAT AAAAAATCAT 25200

CAAGTCCAGC CTTTCTAGAA AAACTCAAAC CAGAGCTTAC TCTTATCTCA GTTGGAAAGA 25260 

GCAATCGAAT GAAACTCCCC CATCAGGAAA CATTGACACG ACTGGAAGGT ATCAATAGCA 25320 

AAGTTTATCG AACTGACCAG CAAGGAGCTA TACGTTTTAA GGGGTTGGAT AGTTGGAAAA 25380 

TCGAAAGTGT TCGATAGGAA GGATAAATGT TGTAGATTAG TGAAATAAAC TAAAAATTTG 25440 

TTGCATAATA ATGATAAAAA TGGTATAATG AAAACGTATT CAATATTGAG GATATAAAAT 25500 

CATTAAAAAT CAGCAAAAGT TGTTTTATTA GTTAGTTTAT AATCTATTGG TCTTCTTCAG 25560 

TCCAGTGTAT CTGCTGTGAC AGTCACTAAA AGTTACAAGT ATGATTGGAA TACGGTTTGG 25620 

GAATATAGTA CCAACTATCA CGACCATCAG TATGCTTGGA TTCCGTCATG GTCTCGTTAT 25680 

GACAGCTATT CTGAGTATAA AGTTGGCGGA GGCTGGAACT ACGCTCGTTA TGAGGTCATA 25740 

AACTATTACA GCGGAGGCTA TTAATTCTTA AAGAGTGAGA AAAAGGAGGG CTAGATATGT 25800 

TGCAGCTTAC TCATGTGACC TTAAAAACGC GACAAGTCAT CTTGCAAGAT GTGGATTTCA 25860 

CCTTTAAAAA GGGTAGGGTT TATGGTCTTC TTGCTATCAA TGGCTCTGGA AAGACGACCC 25920

TGTTCCGTGC CATTAGCAAT TTAATTCCCA TAAGTAGTGG AAATATCGCA GCCCCTCCTT 25980

CTTTATTTTA TTATGAGAGT ATTGAATGGC TGGATGGAAA CTTAAGTGGG ATGGACTACC 26040 

TTCGTCTTAT CAAAAACATC TGGAAGTCAG GTCTGAACTT GAGGGATGAA ATCGCCTATT 26100 

GGGAAATGTC TGACTATATC AGTCTTCCCA TTCGCAAGTA TTCCTTAGGC ATGAAGCAAC 26160

GCTTGGTGAT TGCCATGTAT TTCCTCAGTC AGGCCAAATG CTGGCTCATG GATGAGATTA 26220 

CAAATGGCTT AGATGAGTAT TATCGACAGA AGTTTTTTGA TAGGCTAGCA CAAATCGATA 26280

GACAAGAACA GCTGGTTCTT TTAAGTTCCC ACTATAAGGA AGAGTTGGTT GATGTCTGCG 26340

ATAGAGTAGT AACCATTCAT CAGGGGCAGA TAGAAGAGGT TTAGTTTATG AAAGATGTTA 26400 

GTCTATTTTT ATTGAAAAAA GTTTTCAAAA GCCGCTTAAA CTGGATTGTC TTAGCTTTAT 26460 

TTGTATCTGT ACTCGGTGTT ACCTTTTATT TAAATAGTCA GACTGCAAAC TCACACAGCT 26520 

TGGAGAGCAG GTTGGAAAGT CGCATTGCAG CCAACGAGAG GGCTATCAAT GAAAATGAAG 26580 

AGAAACTCTC CCAAATGTCT GATACCAGCT CGGAGGAATA CCAGTTTGCT AAAAATAATT 26640 

TAGACGTGCA AAAAAATCTT TTGACGCGAA AGACAGAAAT TCTGACTTTA TTAAAAGAAG 26700 

GGCGCTGGAA AGAAGCCTAC TATTTGCAGT GGCAAGATGA AGAGAAGAAT TATGAATTTG 26760 

TATCAAATGA CCCGACTGCT AGCCCTGGCT TAAAAATGGG GGTTGACCGC GAACGGAAGA 26820 

TTTACCAAGC CCTGTATCCC TTGAACATAA AAGCACATAC TTTGGAGTTT CCGACCCACG 26880

GGATTGATCA GATTGTCTGG ATTTTAGAGG TTATCATCCC AAGTTTGTTT GTGGTTGCTA 26940

TTATTTTTAT GCTAACACAA CTATTTGCAG AAAGATATCA AAATCATCTG GACACAGCTC 27000

ACTTATATCC TGTTTCAAAA GTGACATTTG CAATATCCTC TCTTGGAGTT GGAGTGGGAT 27060 

ATGTAACTGT GCTGTTTATC GGAATCTGTG GCTTTTCTTT TCTACTGGGA AGTCTGATAA 27120 

GTGGTTTTGG ACAGTTAGAT TATCCCTACC CAATTTATAG CTTAGTGAAT CAAGAAGTAA 27180 

CTATTGGGAA AATACAAGAT GTATTATTTC CTGGCTTGCT CTTAGCTTTC TTAGCCTTTA 27240 

TCGTCATTGT GGAAGTTGTG TACTTGATTG CTTACTTTTT CAAGCAAAAA ATGCCTGTCC 27300 

TCTTTCTTTC ACTCATTGGG ATTGTTGGCT TATTGTTTGG TATCCAAACC ATTCAGCCTC 27360 

TTCAAAGGAT TGCACATCTG ATTCCCTTTA CTTACTTGCG TTCAGTGGAG ATTTTATCTG 27420 

GAAGATTACC TAAGCAGATT GATAATGTCG ATCTAAATTG GAGCATGGGA ATGGTCTTAC 27480 

TTCCTTGCCT GATTATCTTT TTGCTATTGG GAATTCTATT TATTGAAAGA TGGGGAAGTT 27540 

CACAGAAAAA AGAATTTTTT AATAGATTCT AGCTTTCCTA TAGGTAGGGA AAATAAGTAA 27600 

AAACTAACAT AGAGAGGGAA TCAACTTGAT TCTCTCTTTT TGATTCGAAA ACCAAACCAA 27660

AATACAAACA CAAACTTTTC AAAAAATAAC TTTTTATCTT GACAAGAGCT AGAAAACTTG 27720

GTATCATATA AAAGTTGAGA AAAGCAGAAG TGAGAGCTTC TCGCCTTGTG ACATTAAGTT 27780 

GCCTGGCCCT ACGGATGAAA AGTTTCGAAG AAACGCTATC ATAACGTGCG GGCTTGTATA 27840 

TTTACAAGTC CGCTATTGTT TTTCTCTAAT AAAACAAAAG AGGTGAAAAC CATAGCAAAG 27900 

CAAGACTTAT TCATCAATGA TGAGATTCGT GTACGTGAAG TTCGCTTGAT TGGTCTTGAA 27960 

GGAGAACAGC TAGGTATCAA GCCACTCAGT GAAGCGCAAG CTTTGGCTGA TAACGCTAAT 28020 

GTTGACCTAG TATTGATTCA ACCCCAAGCC AAACCGCCTG TTGCAAAAAT TATGGACTAC 28080

GGTAAGTTCA AATTTGAGTA CCAGAAGAAG CAAAAAGAAC AACGTAAAAA ACAAAGCGTT 28140 

GTTACTGTGA AAGAAGTTCG TCTAAGTCCG G 28171 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7147 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CCGCTCAACT TTTGCAATCA AGGCTAAGTA GACAGCAGCA AATTTCATAT TGTATAATTT 60 

CTGACTCATA CTTCTCTCTT TCTATGTGTA CTAGTATAAA TAAGAAAAAG AAGGCCGTCA 120 



AGCCTTCTTT TGATTTATTC TTCTGCTTCA TCTTCTGTAA ATTGACTATT GTACAAGTCA 180 

GCGTAGAAGC CACCTTGCGC CATCAGTTCC TCATAGTTGC CTTGCTCGAT GATATTTCCA 240 

TCTTTCATGA CCAAGATCAA GTCTGCATTT CGGATGGTTG ACAAGCGGTG GGCAATGACA 300 

AAGGATGTGC GTCCTTCCAT CAAACGGTCC ATGGCTTTTT GGATCAATTC CTCTGTCCGT 360 

GTGTCAACAG AAGAAGTCGC CTCATCCAAA ATCAAAAGCG GTGCATCCTT AAGAAGGGCA 420 

CGAGCAATAG TCAATAGTTG TTTTTGTCTT ACAGACAAGG TCACGGTGTC ATCCAAGATG 480 

GTATCATAGC CATCTGGCAA GGTCATAATA AAGTGGTGAA TTCCCACAGC CTTACTAGCT 540 

TCCATCATTC GTTCATCACT AATCCCTATT TGATTATACA TGAGATTGTC TCGAATAGTT 600 

CCTTCAAAGA GCCAGGTATC CTGCAAGACC ATTGAAAAGG CATCATGCAC TTCTGAACGC 660 

GTCATAGCCT TGGTATCCAC ACCATCAATG CGAATACTTC CCTTATCAAT CTCATAGAAT 720 

TTCATCAAAA GATTGACAAT GGTTGTCTTA CCAGCCCCAG TCGGCCCAAC AATGGCAACC 780 

TTTTGACCAG CATGAGCTGT CGCAGAGAAG TCATAGTCTT GAACATTGAC ACCGTCCACC 840 

AGAATTTCTC CTGCTGACAC GTCGTAGAAA CGTGGAATCA GATTGACCAG AGTTGATTTA 900 

CCAGAACCTG TTGACCCAAT AAAGGCCACT GTTTGACCAG TTTCTGCTTT AAAGCTAACA 960 

TGTTCAATAA CTGCCTCCGA ATTTGCCGCA TAGCGgAAGG TCACATCCTT AAACTCGACC 1020 

TGACCTTTGA AGTTTTCATC AGTCAGCTGC ACTTGAACAG GGTTTTGGAT AGAAGAATGC 1080 

AAATCTAAAA CTTGATTAAT CCGCTTAGCA GAGACCATAG TTCGGGGAAG AACGATGAAG 1140 

AGTGCTCCCA TGAGAAGGAA GCCCATGACA ACCTACATGG CATAAGACAT GAAAACAATC 1200 

ATGTCACTAA AGAGAGGCAG ACGCGCTATC GGAGCAGCGT CGTTAATCAC ATAGGCCCCA 1260 

ATCCAGTAAA TCGCCACACT CAAACCACTT GAAATCCCCA TCATGATAGG ATTCAAAATA 1320 

GCCATAAGAC GGTTGACAAA CAAATTCAAA CGGGTCAATT CATCATTTAC TGCTGCAAAT 1380 

TTTTCATTTT GATAATCCTC TGCATTGTAG GCACGAACGA CACGAATACC TGTTAAACTC 1440 

TCACGAGTGA TACTGTTCAG TTTATCTGTC AGCCCCTGAA TCAAGGACTG TTTTGGAAAG 1500 

GCTAGCGTCA TCAAAACGGT CGTCATCAGG ACGTTGATAA TCACTGCCAC AAGTACGGCC 1560 

CAGAGCCAGT ATTCTGAATG ACCTAAAATC TTCCCAATAG CCCAGATAGC CATAATTGAA 1620 

CCACGCGTTA CCACTTGCAA GCCCATAGTA ATCAACATTT GAACTTGAGT AATGTCATTG 1680 

GTAGTACGCG TCAAGAGGCT AGGAATTGAA AATTTCTTAA TCTCTGTCTG CGAGTAATCC 1740 

AAAACTCGGT TAAAAATATC ACTTCTCAGC CTACTAGTAT AAGAAGCCGC CACTCX^GAT 1800 

GCAAAAAATC CAACTGCAAC TACGGACAAG AAGGCAAGAA AGGACATTCC CATCATCATG 1860 

CTTGCCGACT GCCACAACTC ATCTAAATTA GTTTCTTGAC TACCTAGCAA ATCCGTAATT 1920

TTCGAGATAT AGGTCGGCAC TTCCAACTCT AGATAGACCG AAAAGCAAGT AAAGAGAATG 1980

GCTAGTAAAA TCATCCCCCA TTCTTTTCTA CTAATTCTTT TGGCTAATTT CTTTATTCTC 2040

TCCTCCTATT CCCTTGATAT TTTGCCTGTA GTTGACCGAG AACCTTCTCA AAAATCAGTA 2100

ATTCATCTTC ATCAATGTCT TCCATCAACT GCTTGTCTAT GCGTTCAAAA AAAGCCTTAA 2160

CCTGTTGCAT CTGAGAACGT GCTTTGTCCG TCAGACGAAC AAACTTAGCC CGCTTATCAA 2220

CAGGACTCGC CTCCAATTCC ACCAAACCAT TTTGCACTAT ACGCTTAACC AGATTACTAG 2280

CAACAGGCTT GGTAATATTG AGTTCCTGCT CGATATCTTT AATCAAGACC AAGTCTTGGT 2340

TTTTCTCGCG ATTATCCAAA AAACGCACAA CCTGACCTTG CGGCCCACCC ATAAATTCAA 2400

TGCCGCAACG TTTGGCTTCC TTTTGCACCA TCAGGTGAAT TTGATGACCA AAACGCTTAA 2460

AGACTAACAT CGGTTTATCC ATAATCTCCC CCTTCTAAAT AAAAATAGTT CTCTGGAGAA 2520

TAATTAAATT TCTATGAGAA CTATTTTCTT GATTAAAAAA ATCCCAAGTG ATTTTCTCAC 2580

TTAGGATCAT GTTCTATAGG TTAAATTAAA ACCCATCTAC GTTCGTATAA ATCTTTTGGA 2640

CGTCTTCGTC GTCTTCAAGA ACGCTGTAAA GTTTTTCAAA GGTTTCAAGG TCTTCGCCTG 2700

ACAATTCCAC TTCTGACTGA GGAATCATTT CCAATTCAGT CACTTGGAAT TCTTCAATAC 2760

CAGACTCACG GAGGGCAACG ATAGCCTTGT GAAGGTCAGT TGGCGCTGTG TAAACTGTGA 2820

TTGTACCTTC TTGTGCTTCT ACGTCATCCA CATCCACATC CGCTTCGAGC AATTGCTCAA 2880

AGACTGCGTC CGCATCTTCA CCTCCAAATA CAATAACACC TTTGTTGTCA AAGAGGTAAG 2940

AAACAGAACC TGAAGCGCCC ATGTTTCCGC CGTTTTTACC AAAGGCTGCA CGGACATTGG 3000

CTGCTGTACG GTTGACGTTA GAAGTCAAAG TATCCACAAT TAGCATAGAG CCATTTGGCC 3060

CAAAACCTTC GTAACGTCCT TCTGTAAAGG TTTCGTCTGT GTTTCCTTTG GCTTTATCAA 3120

TCGCTTTATC GATAATGTGT TTTGGCACTT GGGCTTGTTT AGCACGGTCG ATAACGAATT 3180

TCAAAGCTGA GTTTGATTCT GGATCTGGAT CACCTTTTTT AGCTGCTACA TAGATTTCTA 3240

CACCAAATTT TGCATATACT TTAGAGTTAG CTCCATCTTT AGCCGTTTTC TTGGCTACGA 3300

TATTGGCCCA TTTACGTCCC ATTAGGAATC TCCTTTTTTC ACATTTTAAT CTTTCTTATT 3360

ATAACACAAG TTTTTTTGAT TTTCACTAGA GGAAATGGAT TTTATTAGCA AATCAAGCTA 3420

GGATAGCACT TTACCTGCTA AGATGGTCTT GCCTTTCTAT CTTTATCAAC AGGCACTCAT 3480

CCACATTCAA AAAACAAACT AGACCATTAT CTGCAAATAG AAAGTTTCAG CCAAGTTTGA 3540

CAAAGTCAGC TCAAATTACT GTTTGAAGTT TGTAGATATA AGCGACAAAA ACAATCATAC 3600

TGCACCTTTT GTTGACAGTC TACTCCAGAC ATATCATAGT TCAAGTAAAT ACTTTGAAAT 3660

TCAACAGTTC TTATAGGCGC TATTGTATTC TAAGAAATCA ATAGAAGAGT TTCTAAGCAA 3720

ACCTCTAATA CTCAATAAAA ATCAAAGAGC AAACTAGAAA GCTAGCCTCA GGTTGCTCAA 3780

AACACTGTTT TGAGGTTGCG GATGGGGCTG ACATGGTTTG AAGAGATTTT CGAAGAGTAT 3840

AATTTACGTG TTCCCAAGAT GGAGAAGTTA GACTAGTACA CTGGCACTTC TAAAACATTG 3900

CTAGCAATTG ATTTGTTCAT ATTTAATTTC ATTTTTTCCA TAAATGGGTA TTAGATATAA 3960

ACAGCAAAAT ATTTCCGATA CGTGTCGTTC TTGAATTTCC AATCATCTAA AACAAGTAAA 4020

GGATAATCAA TCCCCTGTAT ATCAAGGAAT TGGCTACCCT TTTTACTTTT TTACACATTC 4080

TGTTTGATAG ATTCATTTTA ACATCACGAG CATACTCCAA TGGAAATCGC TAGGCAAGAG 4140

ATAAACTTTC AGATATCCGC AGAGAGATCA TCGCCTCTTT TTGTCGCAAG CATTCTCCTC 4200

TCCTAGTCAT TTTCTACCTT ATCTTCTACC TGAGGATAGA GAGTTGTTCC CCAAATAGAA 4260

ATCGTCCGCT TACGCACTAG TGGCAAATCG GTTTTTTCAT AAACCGTACG CCACCATTCC 4320

CAGGCAAGCC CGGTACACTC TCTAATTTTG ACAGAGAGAT TACGAACATT CCCTTTTAAA 4380

GGAATACTAG TGGTAAAGTG AGCCGTTAAA TCCTGCCCAT TTCTGTCCCA AGCCTTAGGA 4440

GTCAAGACTT CCTTACCTTG ATGATCATAG GATAATTCAT TCCAAGTAAT ATAATATTGG 4500

GCAACATAGG CACCACTATG ATCCAGCAGT AAATCTCCGT TTCTGTAAGC TGTAACCTTA 4560

GTCTCAACAT AGTCTGTACT ATTTTGAAAG GTCGCAACTA CATTGTCACG TAAAAAAGAA 4620

GTTGTATAGG AAATCGGCAA GCCTGGATGA TCTGCTGTAA AGCGACTGCC TTCTTGAATC 4680

AAGTCCTCTA CCATATCCAC CTTGCCTGTT ACAACTCGGG CACCCGAACT TGGGTCGCCC 4740

CCTAAAATAA CCGCCTTCAC TTCTGTATTG TCCAAAATCT GTTTCCACTC TGTCTGAGGA 4800

GCTACCTTGA CTCCTTTTAT CAAAGCTTCA AAAGCAGCCT CTACTTCATC ACTCTTACTC 4860

GTGGTTTCCA ACTTGAGATA GACTTGGCGC CCATAAGCAA CACTCGAAAT ATAGACCAAA 4920


GGACGCTCTG CAGAAATTCC TCTCTGTTTT 


AAATCCTCTA 


CCGTTACAGT 






ACATCTCCTG GATTTTTAAC AGCATCTACG CTGACTGTAT AATAAATCTG CTTAAAATTA 5040

ACAATCTGAA TCTGCTTTTC GCCTGAATGG ACAGAGTTAA AATCAATATC AAGAGAATTC 5100

CCTGTCTTTT CAAAGTCAGA ACCAAACTTG ACCTTGAGTT GTTCCATGCT GTGAGCCGTG 5160

ATTTTTTCAT ACTGCATTCT AGCTGGGACA TTATTGACCT GACCATAATC TTGATGCCAC 5220

TTAGCCAACA AATCGTTTAC CGCTCCGCGA ACACTTGAAT TGCTGGGGTC TTCCACTTGG 5280

AGAAAGCTAT CGCTACTTGC CAAACCAGGC AAATCAATAC TATAAGTCAT CGGAGCACGA 5340

TCGACCGCAA GAAGAGTGGG ATTATTCTCT AACAAGGTCT CATCCACTAC GAGAAGTGCT 5400

CCAGGATAGA GGCGACTGTC GTTGGTAGCT GTTACAGAAA TATCACTTGT ATTTGTCGAC 5460



AAGCTCCGCT TCTTTCTTTC GATAACAACA AACTCATCGG GTAGCTGATT ACCCTCTTTG 5520

ATGAAACGAT TTTCAATACT TTCTCCCTGA TGGGTCAAGA GTTTCTTTTT ATCGTAATTC 5580

ATAGCTAGTA TAAAGTCATT TACTGCTTTA TTTGCCATCT TCTACCTCCT AATAAGTTCC 5640

TGGATTGAGT TGCATAAACT CAGACTTGTT CAGCGAAATC AGCCGTGGTT GGACTAAGTA 5700

ATCCAAAATT TCCTCGTACA ATTCTTCTGA GACATTGCGT CGCCGTCTGG CTAAATAAGA 5760

AGTCGGAATG ACCGTATTAT CCAACATAAA TACCTTATCT AAGTCAATCA AGGTTGGTCT 5820

TGTAAAAGGA TTACGAGCTA GATCCGGCTC TTCTATCATA AAGTTCTTGA CCAAACGTCT 5880

GGTCAAGAGA GCTGGTTTGA AGGTCTGATT TTTAACCAAC TCTTTGTTTT TAGTCATGCT 5940

GTTGTCAATA CAGATATACA TATGATTCTT CACAGCCAAA TCGCTACTAA TAGTCGGAAA 6000

AGGCAAATAA AGAGCTACAA CATCTCCTCT CTTAATCAAG CAAGAGCACC CCCTTTTCTC 6060

CTAATGTAAC ATAGACAGGA TTGACCAAGT CTTCTGATTG ACTCAGAATT TCCAAAGTTT 6120

GAGTTTGGCG CGCTGTCAAT TTAGTAGCAT CTTGTCTCTT CAATACAAAA TGCTTGTCGC 6180

CAATAACCTT GACAATATAA TCCTTCTCCA AAGCTGACTG GTAAATCCAC ATCAGATGTT 6240

GTCTGTCCTG AGAACTCAAG AGAGAAGGAT TTTCAAGCCT CCCGATAGTC TGATAAAAAT 6300

CAAAAACAGG AGCTAACTCC TGCCAATCTG ATTGGCTAGT TGTCAAGGCT AGAAAAAGGG 6360

CTTTGCGAGC TGATACTTCT TGGTTAGCCT TGAGAGTTAC TTTCCCCTCC AAGTTTTTTA 6420

GAAATCGGGA AACTCCAGAA AGCAAATTTT TCTCTAACTG CGAGAAATAA AAACCTTTCG 6480

TTCCCAGACA TAAGTCTTTC ATGTCGCTTT CTCTAGCAAA TAAGAGCTCA AACATTTGAT 6540

AGTAAAAGAA AAATATCTGG CACTGGGTCG CGCTCATCTT TTCCTTATCG GCTTCTTTTT 6600

TTAACCAGAG CAAGGGCGAC AGGTAGCTGG ATTGAGACAT TTCCTCTACC TCCTACTCTT 6660

TTTTAACTGG AGCATCTGCA CTAGCTGCCA CTTCTTTTGA CTGGATACTT TCCCACTGGT 6720

TAATCTCCTC TGAGATAAGA CCTTCGCATG TCTTGACAAA TAGGGCAAAA GCCTTGGTCT 6780

TTCCTGCATA TTTCTCCGTT TGGCATTGAT AGAGGAATTT TTCTTTCTCC AGGAGTTGCG 6840

CAGTTTTTTG GTAAGAAATC CAATTTTCCT TTGCATTATA CAAATTGATA ATCCCCTCAC 6900

ACAGCAAGCC GAGACTGGAT AAGGCAACCG AAATCAAACG GTAGCGATCA CCTGGCATAG 6960

GAATAGCACA AAAGACAGCT ATGAGGAAAC CTGCCACGAT TTCTGTTATT TTTAATACCT 7020

TATAGCGCCT ACGATGTTGA ACGCTTTTCT TTAAAAAATG AGCTATCTGT ACGTCTAATC 7080

GCTCTGTCAG GTACATTTCT TCTGGCGTCA TATTCGTAAC TCCTTTCATT TACTTTGATA 7140

ATCAGGG 7147

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 755 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

CCGCATGGGA TTGGTGTCCT TTTGGGCAAT CTCTTTGACC AAACTGGAAA CATGTTTTAT 60 

GCGCCTGCCT TTACTGCCCT TGTCGGCGGT ACGTCTATAT GATCCTAGTC GCAAAAGTTC 120 

CGCGCTTTGG AGCCATTACC ACTATCGGCC TTGTCATTGC CCTCTTTTTC TTGGGAACTA 180 

AACACGGTGC TGGTTCCTTC CTTCCTGGAA TTATCTGTGG CCTCCTAGCA GATGGAGTAG 240 

CTCATTTAGG AAAATACAAG GACAAAACAA AGAACTTCCT TTCTTTCATT ATTTTCGCCT 300 

TTAGTACAAC AGGACCAATC TTGCTTATGT GGATTGCGCC CAAAGCCTAT ATGGCTACTC 360 

TTCTGGCAAG AGGAAAATCC CAAGAATATA TCGACCGTAT CATGGTCGCT CCAAACCCTG 420 

GAACTGTCCT TCTATTTATC GCAAGTATTG TCATCGGAGC CCTAGTGGGT GCCTTGATTG 480 

GACAAGCCTT GAGTAAAAAA TTTGCCCAGA AAATCTGATC AGTTAAAAAG AGCCACGCGG 540 

CTCTTTTTTA TTTATGGCTC AATTTCTTAG TCAAGAAATC TCCCAAGAAT TGGATTGCAA 600 

AGATAATCAA AATGATAATA ATGGTTGCCA AGATGGTCAC ATCGTGATTG TAGCGGTTAA 660 

ATCCATAAGC GATGGCTACG TTACCGATAC CACCAGCTCC AACCGCACCG GCCATAGCTG 720 
TTtcCCAACA AGGGaAtCAA GGTcACAGTC GTCAC 755

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3010 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TTCAATTGGT ATCTCAATCA ACGGTCTTCA CATGGTTTCA ACTGGTTTGA CTCTTGAAAA 60

AGCGAAAGCT GCTGGTTACA ACGCAACTGA AACAGGCTTT AACGATCTTC AAAAACCAGA 120

ATTCATGAAA CATGACAACC ATGAAGTAGC AATTAAGATT GTCTTTGACA AAGATAGCCG 180

TGAAATTCTT GGTGCCCAAA TGGTTTCACA TGATATTGCA ATTAGCATGG GAATCCACAT 240

GTTCTCACTT GCTATCCAAG AGCATGTGAC AATTGATAAA TTGGCATTGA CAGACCTCTT 300



CTTCTTGCCA CACTTCAACA AACCATACAA CTACATCACA ATGGCTGCCC TTACGGCTGA 360

AAATTAAAAA TGAATGAGCT ATCTGGCCTT AAGTTAAGGT CAGATAGTTT TTAGCTAATT 420

TGTCCCCATA CAATTATAGT TTTTTTATCT TGTGCTTCAT TCTGTTCTGA CTTAAAATGA 480

AAAGGTAGCT ACCAATACAA ATGATGAGGA TAAAACAAAT GACTGAAAAT CGTTATGAAC 540

TAAATAAAAA CTTGGCACAG ATGCTCAAGG GTGGTGTTAT TATGGATGTG CAGAATCCTG 600

AACAGGCTCG TATCGCAGAA GCTGCTGGTG CGGCAGCTGT GATGGCCTTG GAACGAATTC 660

CGGCTGATAT TCGTGCAGCT GGAGGAGTTT CCCGCATGAG CGACCCAAAG ATGATTAAGG 720

AAATCCAAGA AGCGGTTAGT ATTCCAGTAA TGGCTAAGGT CAGAATCGGG CATTTTGTTG 780

AAGCTCAGAT TTTAGAGGCT ATTGAAATTG ATTATATCGA CGAGAGTGAA GTTCTATCTC 840

CAGCTGATGA CCGTTTCCAT GTGGACAAGA AAGAATTCCA AGTTCCTTTT GTCTGTGGTG 900

CTAAGGATTT GGGTGAAGCC TTGCGTCGTA TCGCTGAAGG TGCTTCCATG ATTCGTACCA 960

AAGGAGAACC AGGGACAGGG GATATCGTCC AAGCTGTTCG TCATATGCGT ATGATGAATC 1020

AGGAAATTCG CCGCATTCAA AACTTACGTG AGGACGAGCT TTATGTTGCT GCCAAGGATT 1080

TGCAAGTCCC TGTAGAATTG GTCCAATATG TTCATGAACA TGGAAAATTG CCAGTTGTAA 1140

ATTTCGCTGC TGGAGGTGTT GCAACGCCAG CAGATGCTGC GTTAATGATG CAATTAGGGG 1200

CAGAGGGGGT CTTTGTCGGT TCAGGTATTT TCAAGTCAGG AGATCCTGTT AAACGAGCGA 1260

GTGCCATTGT TAAGGCTGTG ACTAACTTCC GTAATCCTCA AATCCTAGCT CAAATCTCTG 1320

AAGATTTAGG AGAAGCCATG GTTGGTATTA ATGAAAATGA AATCCAAATT CTCATGGCTG 1380

AACGAGGAAA ATAGATGAAA ATCGGAATAT TGGCCTTGCA AGGGGCCTTT GCAGAACATG 1440

CAAAAGTGCT AGATCAATTA GGTGTCGAGA GTGTAGAACT CAGAAATCTA GATGATTTTC 1500

AGCAAGATCA GAGTGACTTG TCGGGTTTGA TTTTGCCTGG TGGTGAGTCT ACAACCATGG 1560

GCAAGCTCTT ACGTGACCAG AACATGCTAC TTCCCATCCG AGAAGCCATT CTATCTGGCT 1620

TACCAGTGTT TGGGACCTGT GCGGGCTTAA TTTTGCTGGC TAAGGAAATC ACTTCTCAGA 1680

AAGAGAGTCA TCTAGGAACT ATGGATATGG TGGTCGAGCG TAATGCTTAT GGGCGCCAAT 1740

TAGGAAGTTT CTACACGGAA GCAGAATGTA AGGGAGTTGG CAAGATTCCA ATGACCTTTA 1800

TCCGTGGTCC GATTATCAGT AGTGTTGGTG AGGGTGTAGA AATTTTAGCA ACAGTGAACA 1860

ATCAAATTGT TGCAGCCCAA GAAAAAAATA TGTTGGTAAG TTCTTTTCAT CCAGAATTGA 1920

CTGATGATGT GCGCTTGCAC CAGTACTTTA TCAATATGTG TAAAGAAAAA AGTTGAGATT 1980

GAATTTCTCA ACTTTTTTAC ATGTAATAAA CAATAGCGAT GTATTGAAGT GCGGACGCAG 2040

CTAGGATAAA GAGATGCCAA ATCATGTGGA AATAAGGTTT TTTCTTGGCA TAAAATCCAG 2100

CTCCAACTGT ATAACAGAGT CCGCCAGTTA CCATGAGACT CCAGAAAACG GGTGTCGTTT 2160 

GACTGATAAT GGCAGGAATG ATAGCCAGAA CCAACCAGCC CATAATCAGG TAAAGAGCAA 2220 

GGCTAAATTT CTCATTGACC TTTTTAGCAA AGATTTTATA GAGAATACCA AAGATGGTCG 2280 

TTCCCCATTG GATGACAATA ATCAGATAGC CAAACCAGTT ATTCATCAAG GTCAAGACAA 2340 

CGGGCGTGTA TGAGCCGGCA ATGGCAACGT AAATCATAGA ATGGTCAATG ATTCGCAAAA 2400 

CATATTTGTG GGTCGAACCA TAGGCCATAG AGTGATAAAT GGTGGATGAT AGGAACATGA 2460 

GAAAGAGACT GATGACGAAA ATGGAAACGC CGATAGAGGA TAAAAATCCG TGTGCTTCAT 2520 

AACTATAGAT GGATGAAATA GGCAGCAAGA TAAGCATGAT GACTGCACCC ACAGCATGGG 2580 

TCACGCTATT AGCAATCTCC TCTCCAAAAC TGAGTTGTTT GCTGAGTTTA AGACTAGTGT 2640 

TCATTGGATT ACCTCCTCTT GAGTATGATC GATTAAGTCT AGAGTTTGAT GATAGAGTTT 2700 

AACGGTTTGG CAGCTGGTTT GGATAATAGG GTTAGCTGGG TCAATTCCTT GGTTCATGTA 2760 

GTCCACAAAA GCATCGTAGA GTTGGTCTGA ACTTGCTTGA GTTTGTAGAG TATTAAGTGT 2820 

CTGGGCTATT TCTTGAATAG AAAATACAGA CTTGAGGGTT GTGATAGCAA TCAAACGGGC 2880 

AATCTGTTGG CGTTGGTATT TTTTTTTGTC AGGCTTTGTC AGGTAACCAT TTTTCACATA 2940 

ATTGTTGACC ATAGATGCTG TTAGGCCCTT GTCTTTATTA GGAGAGATAG GGGCGCAGAC 3000 

CTGATTGACA 3010 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CATAAATCGG TGCAAATAAC TTAATAGTGA AGTAGCCATT TCTTTCGTAT TTACCTGAGG 60 

CATATTCCCT AGACGAAAGA ATATTATTAT CAATCAAATC ATTGAATGAA CGTAGTCTTT 120 

CAACTTCTTC TACTGTTAGA TTTCTGACAA CATTTGTTGC ATAGACCTTA TTTCCATCAG 180 

GATCAGGATG GTACTCATTT GTAACTTTTC TAAGAAGTTG TTGTTTTTGA TTCGTATCCA 240 

ATTTAAGAAT TGAATTTCCT TCGAGATATT CCAACATATA AACAACGTCA AACATGTTGT 300 

GGACATATTG CTTCAAATCA TCTGCATTAT TAAATCTTGT AGTTGGATCA AGTACTTGTA 360 

ATCGTCGACT TTCTGTACTA TCAGATTTTG AATGTTTCAA GATGGAGTTG ATGGTAATGG 420

TCGCATCATC TGGATGGTCT GGTGCTTGTA ATAATCCTTT AGCAAAGAAC TCTGGTCCCA 480

AGCCACTTCT TCGACCATAT CCTCCAAGAT AAATGTCCTG ATCTGAGTCA TGTGTCATCT 540 

CATGCGTATA AGTAATAGCT CCATCCTTAT CCAACATTCG ATAACCCATA TAATAAACTG 600 

CATCACCTGT AGCATAAGCA CCGTGTTGAT TATGCCCAAC TTTATTTCCA ACAGGTCCAA 660 

AGAAATGTTG CATTGCAGGA TTTGGATTAT CAAAATCTGC CACTTCTGTA GCTTTCCCTA 720 

CGGTATTATC ATCGCCAAAT TTATAAGCAT CGTAAAGCAA AATATTTCTA TAAAGTTTTT 780 

CACGTGCATT GTCGTCTAAA ATACGATACC AATAATCGTA GTGATCTCGC TGACGTTTGG 840 

CTGTTTCACG CGCATTTTCT TCAACAAAAT CATTGAGAGC CTTGCCCGCT TTATGGTCAC 900 

TACTGCGGTA GCGATCATAA GCTCCAAATC CTAGACTAGA CATGGTCGAG ATGACAAATA 960 

CGGATCTCTC TGGCAAGGTC AGGAGAGGCA AGACCATATT GCGGTATTTC CATGTGGCAC 1020 

TCGTGATACG ATCATAAACA CCGATAGAAT ACTTGGTGCC AGCTAACCCT TGCTTCGTTT 1080 

TCACCTCTTC GATAGTGGAT TTTTCTTCGA CAATGTAAGC CTTAGTCTCT GATTTAAACC 1140 

AGTCATTATT GCTTGTATTT GGTAAAAAGA CTTTTCGGTA ATGTTCCAGC GTGCTAAACA 1200 

AATCTGTCGT TCCATGTTGA CTGGCAAGAC TGATACCATA AGTATCGACA TTATTCTTAG 1260

CTAGAAGATT GTTAAAGCCA GATTTACCCA ACTCAATCAG AGTATCTAAT GGTGAAGCAT 1320 

TCCCCTTACC AAAGAAGTCC AAATGGTACA GAACTAGGTC TTTGACATTC ACCTGACCAT 1380 

AGCTAAAGTT ATACCACCGT TCCAGATAGG TCAAGCCAAG TAGCAAGGCT TCCTTGTTGC 1440 

GTTTGATTTT ATCTACAAGA TAACCTTCAG TGACGGGGTT AGCACTAGCC AGTCCAGCAT 1500 

CCGCTGACAA GAGTTTTTTC AAACTGTCTT CCAGTTGTTG TTTTGTTTTG GCGAACTGGT 1560 

CTTCTAGATA GAGCTCAGTT TGCTTGACGT TTGGAGAAAT ACCCAGCGTC TTTCTGATGG 1620 

CTTCTGAATG ATAGTCAACC TTTTGTAAGT CAGGTAAGAC TTGCTTGATG ATAGAGGTTT 1680 

GGTCATACAG GAATTGGTTT GGCGTATAGA GAAGTCCAGT ATTGCCCAGA CTATATTCT^ 1740 

CTAATTTGGC GAAATCATTC TGGTATTTGA GATCCAGCTT CTCAGATAAA TCATCCTTGT 1800 

AGTGAAGCAA GAGTTTGTTT GCAGTCTGTT TGTTAGAAAC AATGTCTGTG ATGACTTGGT 1860

TGTCCTTCAT CATGACTGCT GACAAGAGTT CTTTTTGATA TAAAAGACTG TTCTCATTGA 1920

CCAGGTTTCC GTATTTGACG ATGGTTGCCT TGTTGTAGAA AGGTAGCAAT TTTTCAATGT 1980

TTTTATAAGT CAAGTTGCGC TTAGCTTGAT AATAGGCCAC CTTAGAAAAA TCACTGTCTT 2040 

TTTTGCCACT TGTTGAAAGT GGCTCCACTG TTGGTAAAAT GAGAGGATTG ATTTCTGCTT 2100 

TTTTGCTTGC AATTTGAGAA GCATCTAGCA TTGTTCCTCT TTCTTCAAAG GATTCCTTGC 2160

TGACGACCTC ATCCTTGACC AAGGTGACAT TGTAGACTCT GTTGGCCTTG CTGCTGAATG 2220

TGTCCTTTAC CTTCATTTCG TTATAGTGGT AACCAGTGAT GGCATTTCCG TTGGTTACAT 2280

TAACATCGCT GAGAACATTG GTCAAACTTC CAGCATGCCT AACATCACCA GAAGTTCGAT 2340

CCCACAAATT GCCTGCCACT CCAGCGACTC TACCAAAGTG CTTGACATTG TTGATATCAC 2400

CTTCAGCATA GCTATCTTGG ATCTGTGCAT CTCGGTCTAC TAGGCCTGCA AGTCCACCCA 2460

CAGTCTGATC TGAAGTATTT GTGTTAGATG AAATGGCTAC TGTCGCTTTT GACTTAGTAA 2520

GTAAAGCCTT GTCACCTGTC AAATGACCGA CCATACCACC GATATTGTAG GCAGCAGTCG 2580

TTTCATAAGT GTTGATAATT CTTCCCTTGA AACTGCTCTC TGTGATGCTT GATTGCTCAG 2640

CCTTAGCCAG CAAACCACCG ATACCACGTT CACCAGCCAG AACACCATCG ACGTGAACTT 2700

GCTTAATTTT TGTGTTATTC TGAGCTTCAT TTGCCAGTGA ACCGATATCA TCTTTCCCTG 2760

AAATAGCAAC ATTTTTTAGA CTCAGTTTTT CTACTGTAGC ACCACTCAAG TTTTCAAACA 2820

GAGGTTTTTT CAAATTATAG ATAGCATAAT TCTTGCCATC TTTTTCACCG ATTAAACGAC 2880

CAGTAAAGGT GTCCTTGATA TAGGATCTTT CATCAGGACC AAGCTCCACT TCGTTAGCAT 2940

TCAGGCTGGC CGCTAAATGA TAGGTTCCAG AGGGATTTTG GTTTATAGCT TTGACCAGAT 3000

TACTAAAGGA AGTAAAGTTT GTTGTTTCTT CTGTTCCCTT CTTAGCTAGA TAGAAGGTAA 3060

AATTATCTTT ATATCTGCTT TCTATCTCCT GCTGAAGCTT CTCTACTTTT GCTGTGATTT 3120

TATAAAGGAT TTTATCATTT TTTCTTTCCT CTGATATTGA TGCTACTGGT AGGTATACAT 3180

CTTTGAATGA AGAAGATTTC ACTTTAACAA AGTAGCTATT TGGATTGCTT GGAACTTGCT 3240

CTAACGAAAT GTGTTGTTTA TAAGTACCAT TTGACAAACT GTATAACTCT AGGTCGGAAA 3300

CATTTCTTAA TTCAAGTGTT TTCTCTGGTT CTTCTACCTT TTTATCAGGG TCTAGTTCAT 3360

TTTCTTGTTT AATTTCTTCG TTTCCATTTG AATTGGATGT GTTTGATTCG GTTGAAACAT 3420


CCTCAGTTGA 


ATTTCCGTTT 


GATGGTTCTG 


w A X \~ 1, \J X X A w 


i. ViwVi 1 L iV< X 


n & t n ttv^ t> & fp 

uAJ u 1 Iaj i AX 


3480 


TACCTGAATT TTCTGGTTTT GTTGCAGTTC CGTTTTTTTC TGGTTGATTT GATTCTTCAA 3540

CTGGTGGTTT TGAATCACTA GGTTTATTGG ATACTTCTCC AGTATTTTCG TTAGCTATTT 3600

TCCCAGAGTT TGTTTGTGTT TCTTCTGCAG GTTGAACTGG TTTTTCTGTT TCTTGATTTG 3660

AGGTACCTTC TACTGTGCCT TCATTTGGAT TTACTGGAAC TTCTTCTACA GTTTTTTCTG 3720

AATTTTCATT TTTAGAGTCA TTATGTTCTG GTTTATTTGA TTCTCCAACT GAGGTTGTCG 3780

AATCACTAGG ATTACTGGAC ACTTCCCCAG TATTTTTGCT AGATGTATCT GGTGATACTT 3840

TCTCTGAATT CGTTGTTGAT TCTTCTGCAG GTTGAACTGG ATTTTCTGCT TCTTGAATTG 3900

AGGTTCCTTC TGTAGTACCT TCATTTGGAT TTACTGGTGT TTCTTCTGTT GGTTTTACTG 3960

GAACTTCTTC AGTTTTTTCT GGACCTTGTT CTTTGGTCTT CTCAACCGGA CTTTCAGGTT 4020

TTACTTGCTC AATATTACCC TTATATTCTG GAAGCGGTGC TACCTGCTCT GGTTCACCTT 4080 

TATCACTTAC CACAGTATCT GGCGACTCTG GTTGAACCTC AGTCTCACCT TTGTCGGTCA 4140

CAACTGCTTC GGGTAATGTA GGTTGAACTT CTGGTTCGCC TTTGTCACTT ACTACAGCTT 4200 

CGGGCAACTC AGGCTGAATT GCGGGTTCAA CAATAGCTCC AGACTGTACG TCCTTATGTT 4260 

CTACACCAGT CTCAGGTTGT TCCTTTATAA CTTGAGTTTT TTTAGTACCT TTTTCGACTA 4320 

TTCTTGGACT AGGCGCAGTC GTTGAAGTTG AAACAATTTC TCGCGAAACT TCTTCCTTGT 4380 

TTACAGAGAA TATTCTGACG ATTTCAACTT TCTTACCTAA TTTACCTTCT TGTTTTACTC 4440 

TTACAGTTCC TTCAGCTAAA TCAGGATTTT CTTGAATTTC TTCTTGAAAA TCTATTTTTG 4500 

TCTCCATAGT TTCCTCACGA TATAAGAGTT CAGGTTTGTT CAATTGACCT GATAAAACTT 4560 

CATCCTGTGG ATTTAATGTA TTTACCCCAG TCTTTTCTTT TGGAGAAATC TTCTCCTCTT 4620

TCTTCGTTTC TAGATTCTTA TGTTCGGCTA ATTGTTCTTG AGAATCTGAA GATTGTTTCT 4680 

CTTCTTTTCT TGGATTGATT AATTCAGTAG AGAAAGGTTT TTCAACTACT TGAACTTCTG 4740 

TCGGCTTAGT TGAAGAAACA GGTGTTTGTT CCTGAATAGC TTGTACTGTT GATGGATGGT 4800 

CTACAAAATT CGGTGTAACA TTATAATCCA CCTTTTGTTG TTTTGTAGGA GTGGCAACTG 4860 

AACTCTTTTG ATTACTTACT TCAGACTCAG AAGTCGTTTT TCCCTCTTTG ATATATCCAA 4920 

TATAAGTGTA ACCTGAAATC TCTTTAGGAA GAGGTAATTT TTCTCCAGAG GTCAATTCAT 4980 

AGTCCGTATT GTAATTTAGC AAAAGATGAT TTTCTAAAGC ATGGACTGAA ACTAAGACAC 5040 

CATTTCCTAT CCCTGCAACC AATACTAAAT GTAATACCGT TTTATTCTTA ACCTTTTTCT 5100 

TGGAAACAGC AAAAATTAAA ATTCCCATAG CAGCTAAGCT AGCACCAGCA ACTAGGGCTT 5160 

GCCTCTCATT CTTGCTTCCA GTATTTGGCA ATTCCGCCAG TTGATTTTGA GAATTTAACT 5220 

TATAAACAAG ATAATAAGTT TCATCATCAT TCTCCACGTA TGTCGGAATA TCATAGACAA 5280 

GCTGCTTCTT TTCTTCTGAT GATAGCTCTG AATCTGCCAC ATATTTATAG TGAACTCCCG 5340 

CAGTTTCTTG AGCATCCACA GATGAACTAG CTAATACAGA CATAAAAAAT AAACTTGAAA 5400 

TCGTTGCAGA TACAAGTCCT ACTGATAATT TTCTAAATGA AAAACGCTCT TGTTTTTCAC 5460 

CAAAATACTT TTCCATTATT CCTCCTTGAA ATAAAATTTA TATATGTTAC AAAGACCTTT 5520

ATTATATTAG TGTATTATCT ATTATCTATA GAAAAGGCAG TATACCTTAA TTATACTCTT 5580 

AATTTACAAA AAAGTCTTAA AATTGAGATG CGCTTTCATA CTTTGTTTTA TATTATTTGG 5640 

AGGTACAATA ACACCTACCA TGAAATTTAC ACGGTAGGTG TTACTCATAT CACTAATCGT 5700

TCTAAAAATG GTTTGAGGCA GTTGAGGAGA ATTCCTTCTA TCCAGCTTCC TTGTGCTGAT 5760

GAGCGATGGT CTTCCTGCAG GCTTTTTTTT AGAAAATCTC GGACTTGTTC TGGTGCGATT 5820 

TCAAATTCAA AGGCTTTCAT TTTATAGAAA AAGTCGATGA GATGATCTGA CAGGTATTCA 5880 

GTTGAAAAGG GTACTTCACC ACTTTTTCTA TATTCTAATA AGAGTCTAGA AAATCGAGCT 5940 

TTTTCTTCAG GAAGCTCACG AAAATAGGAA TTGAGGATCC AAGTCTGCTT CTGTTTTCTT 6000 

TCAATTGGAT CCTGACTGGC AATTCGTTGG TCTTTTTCCA GCTCTTTTTG GTATTGTTTG 6060 

GCCTTGATAG CTCGTTCTGC TCTATTTTTA CCAAAAAGAA TTTTTTCCCA CTTGCGTTCT 6120 

TCTTGAGTCA GGGTCTCTGT AAAGCCAAAG TAATCTTGAT AAGCACGCTC TGCGGGTCCC 6180 

ATGGCTAGAA CCAGATTGTC TGCATATTGC TTGGCGATTT TATCCCTCTT CTTGCGTTCT 6240 

TTCTCTGCCT GGATACGGAG TTCTTGTTCG TAGTCAATTT TCTCCTTGCC TAGCTTGACA 6300 

AGGTAGAGTT GGTCATCCGA TTTCCCAAGT AAAAAGGGTT TGATACACTT TTCAAGGACT 6360 

TCTTCCATCC GAGCCTTTTT CTTTGGTTCC GCCTTGGTCC AACTTCCTCC CTGAAAGACT 6420 

TCTAGGAAAA GCTGGTAGTC TCTCTCAGGC GCAAATTGAT TGCCACGATT GGGTTTGAAA 6480 

ACACCTTTTT CCCAGAGCCA TTTTAGAAGT CGCTCGTCAA AGTTACTTTT ATTGACCTTG 6540 

ATTTTTTCCT TTTTCTGAGC TTTTCTGGTT AGATTTTCAA CCTTTCTGAG CAGTTTTTCT 6600 

TCCTCTTCCA ATTGCTGGTC AAGGGACAAT CGATGAAAAT GACGAACACA GTCGCTACCA 6660 

ATTGGAAAGA GGCGTTGGCC TGTGACACCG TTAAAGAGTT CATAAGCGTA TTTGATGGCA 6720 

TTTCCACAGA CACAATTGCT ACGGCCGATA CCGTTAAAAA TAAAGGAAAC TTCATTCCAT 6780 

TCCTTGGTAG CTTGTTCCCA AGTATCCGCT TTCGAAGCCT GTAAAACTGC ATCGTGCAGG 6840 

GATTTTCTAA CTGGAAGTGT CATGAGGTCT CCTTTCTAAT ACTCAATAAA AATCAAAGAG 6900 

CAAACTAGAA AGCTAGCCGC AATCAGCTCA AAACACTGTT TTGAGGTTGT AGATAGAACT 6960 

GACGAAGTCA GCtCAAAACA CTGTTTTGAG GTTGTGGATA GAACTGACGA AGTCAgTAAC 7020 

CATATATACA GCAAGGCGAA GCTGACGTGG TTTGAAGAGA TTTTCAAAGA GTATAAGTTA 7080 

TACTTTTACA ACTTGAACCT CGTCTTTACC GAGTAAAATC AAGTATTTTT CAATATTTTC 7140 

AATCGAATAG GCTCGTGATA AAGCCTCTTC GTATAGAGCT AACTGACCAC GATAGCGGTC 7200 

TACGAGTTGA CTTGGTTCAT CATAGCGGTC TGTCTTGTAG TCGAACAGAA CAATTTTGTT 7260 

TTCGTAAAGC AGATAGCCAT CAAGGATACC ACGGACAACA AAGTCTTCCT GACTCTTTTG 7320

GTCTCGTTTG AGCATGGAGA AAGGTTGCTC GCGATAAAGA TGGTCGGTAT TAGCAAGAAT 7380 

TTCCTGACCG AGTACTGTGT CAAAGAAAGC AAGAATTTTA TCAAGATTGA TCTTGTCTCT 7440 

GACAGCTTGG CTAGTTTGAA CTTGTTTGAG TGTTTCTGTT AGGCTAGCAA GGGTTAGTTG 7500

CTGGCTGAGG TCAATTCTCT GCATGAGTTC GTGAGTAGCA CTACCAATCT CAGCTCCAGT 7560

TACCTTTTCT TTGGTTGAAA AATCTGGCAA ATCGAAGCTG ATTTTCTTGC CTACTGACTG 7620

ACCTTGACCA GCAATCTCGA CACCTTCCAT ATCCATAACT GGTTCGTAGA ATTTCTTGAT 7680

TTGACTTGGG GTTTGAACAC TAGGAAGTTC AATAGCTGCG CGGTGAAGAG TATTATAAAC 7740

TTCCACCTCC TTCAGCATTT CCAGAGCTTC TTTGATGGTA TCTGACTGAC GATTGTCTGC 7800

TTGGGAGCTA TCTTGGAGAG GACTCTTGGT TTCCAACTCT CCGATAGCTT CTCTGGTCAA 7860

CTGATCTTCG CCAATAAAAC GATAACTAAA GTTGAGCTTG TCCTTAGTAA ACACTTTACT 7920

GATAGCCCAA AGCCAATCTT GGAAATTCCG TGCTTGCAGT CTAGTATTGC TATTTAGTTT 7980

CCCATTTTTG GCTGCTGGGT ATTCCTTGGA TTCCAGCTTT TCACGAGAAC CCTTGCCGAC 8040

AAGATAGAGC TTTTTCTCAG CCCGCGTCAT AGCAACATAC AGCAAACGCA TCTGCTCAGA 8100

ATAGCTTGCT AtalJTGTAATT CCTCTTCGTT CTGCCTATAG GTCAGACTAG GAATGGAGAG 8160

TTTGATGGTT TTAGGATAGT GGTCTTCTAC TGCCCCTGTC TCCATCTTGG CAATATATTT 8220

GACACCAAGA CCATTCTGAC GACTGAGAAT GACTTCTGAC ATAGAGTCTT GCTTGTTGAA 8280

ATCTTGATCC ATATTGAGGA TAAAGACGTA AGGAAACTCC AGCCCTTTAC TCTTGTGGAT 8340

GGTCATGAGC TCTACTGCAT CTTTTGGCGG TGCGACGGCC ACGCTTGCCA AATCGTGCTG 8400

GGCTTCTAAG ACTTGGTCAA TCATACGAAT AAAACGCGAC AAACCTTTGA AATTGCTCTT 8460

TTCAAATTGA TCAGCACGCA GTGCTAGGGC ATAGAGATTG GCCTGCCTAG CAGGACCATT 8520

CGGCAAAGCC CCAACATAGT CATAATAAAA ACGGTCGTTG TAAATCTTCC AAATCAAGTC 8580

ATAGAGAGAG TGGGTTTTGG CATACAAGCG CCAAGAAGCT AGGATATCCA TGAATTGCTT 8640

TAGTTTTTCA GCTAGAGCTG TGTGAATCAA GCCTTTTTGA CTACTTGCCA TTTTTTGTGC 8700


ATTGACCAGT 


TTCTCATAGA 


GATTTTCGTG 






CAAGGGACAA 


8760 


ACGTGCTAGC TCATCCTCAT CAAAACCAAA CATTGGAGAC TTCATAAGGG CAACCAAGGC 8820

GTAGTCTTGC AGGGGATTGT GAATGACACG AAGAGTGTCT AGCATGACTT GCACTTCTAG 8880

GGATTGGAGA TAATTGTTTT GCTCTCCGTC AGTTTTGACA GGAATTCCGT ACTCAGACAG 8940

GGCGAGGAGA ATCTGGTCAT TACGACTGCG GCTGGAGGTC AGAAGGGCAA TTTCCTTAAA 9000

GGCAACACCT TTTTCTTGAT GAAGTTTCAG AATCTCCTTG ATAACTAAGC GCATTTCGCC 9060

TGTTAGTTTC GTTTCTGTTT GACTCTCTTC TTCCTCACCT GTATCGTCCT TGTCGTAGAG 9120

GAGAAATGCT GCCTTGTTGT CTGGATTGGG AGTCAGTTTG GTATTGGCAA AAACAAGCTG 9180

GTGCTTGTTA TCATAGTTGA TTTCGCCGAC CTCTTGGTCC ATGAGACGTT CAAAGACATC 9240

ATTGGTTGCT GACAGCACTT CTGAACTACT ACGGAAATTT TCCTTGAGGA TAATGAGCCT 9300

GCCTTCTTGG GGATTTTGCG CATAGCGTTG GAATTTCTCA TTGAAAATCT GCGGGTCTGC 9360 

CTGACGGAAA CGATAGATGG ATTGCTTGAT ATCTCCCACC ATAAAGCGAT TGTGGCCATT 9420 

AGACAACAAT TCCAGCATCC GTTCTTGAAT ATGGTTGGTA TCCTGATACT CATCGACCAT 9480 

GACTTCATGG AAGCGCTCCT GATAAGACTC ACGAACTTGT GGGAAATTCT CTAAAATCTC 9540 

AATGGTGTAA TGGCTGATAT CAGCGAATTC GAAGGCATTT TCCTGTCGTT TTCTCTGACG 9600 

ATAAGCCTCT ACAAAATCGC TCATGAAAGA TTGGAAGGTT TTAGCTAGTT TCCAAGTGTC 9660 

TCCATGATAA CGTTCTTGAT AGTCGAGAAT CGCTATCTGG TCTGATAATT GTCCTAGTTT 9720 

AGCAAACTGG GTCTTTCTCT CTTCGTTGTA GGCATCAGCC AGGGGCTTCA AATCAGCCTA 9780 

CGGCTGGCAT TAGTCAGAGC TCGACCGTTT TTCTCCTTAG AGATGGCGAC AACACGCGCA 9840 

AGCACTGCCT GATAAGCCTG ACTATCGGAC TCCTGATTTA GGGAGCCAAT TTCATCCAGA 9900 

ATTAACTGAA CATTTTCTAA ATAGGCAGCC TTTGCAAACT CCTTGGCATC GTTATCCAGA 9960 

TGGTAACGGA AAAAGCTTTC CAAATCCCAA AGGGCTTGTT TGATTTGCTC GGTCAGTTTT 10020 

TCTTTTTCAC TGGTAAAATC AGCTTTCTCA AATCCTTTGA GGAAAGATTC ACTCAGCCAC 10080 

TTTTGAGGAT TACTGGTGGA TTGGAGGAAG TCATAGATTT TATAGACCTG CTGGCGCAGA 10140 

CCCCGTTCGT CCTTGCCACG CCCAGCAAAG TTTTTCAGCA AATGACTAAA GGTCTCTTTC 10200 

TGTTTACCTT GGTAATGCGC TTCAAAGACC TCATGAAAGA CTTCGTTTTC GAGAATAAGT 10260 

TGCTCGCTTT GGTTTTGTAA AATACGGAAA TTAGGTGCAA TATCAAGCAG ATAACCATGT 10320 

TTGCCAAGGA ATTTTTGTGT GAAAGAATCC ATGGTTCCAA TGGCAGCGTT GGGTAGGTCT 10380 

GCCAACTGGC GACCCAAGTG TTGTTTGAGG TCGACATCAT CTGTTTCTTG GATTTTCTTG 10440 

CTGATTTTTT TCTCTAAACG TTCTTTAAGT TCAGTTGCAG CCTTGACGGT AAAGGTTGAG 10500 

ATAAAGAGTT GAGAAATTTC GACACCACGC GCCAATTGGT CCAGAATGCG CTCTGCCATG 10560 

ACAAAGGTCT TTCCAGAACC AGCCGATGCT GAGACCAGGA TATTCTGGGC AGAAGTGTAG 10620 

ATAGCTTCGA TTTGCTCGGC AGTTTTCTTC TGTTCCTTGC TCGAATTTGC TTCTGCTTCT 10680 

TGCAGTTTTT GAATCTCCTC CTCACTTAAA AAGGGAATAA GCTTCATCGA TTCAACTCCT 10740 

CTCTTATTTT TTCAAGCCAA GCTTGCTTGA GTTTTTCTCC GACCAGACGC TTGCCATCAG 10800 

CTAGGTCCAA CTTTTCTAGG AAACGGGCTT GGCCCAGATG GTAATTGGCT TCAAAGCCTG 10860 

TAATAGCCTG ATGTTGCTGG ACGTATGGGG CAATGCTTCT GCCATTTTCA GTATAAGGAT 10920 

TGATGGCGAA CCGGCCTGCT AAAATCTTCT CAGCAGCTTT CTTGTAAAGA TAGGCATTGT 10980 

AGTCCAGTAG GAGCTGAAAT TCCTCATCTG TCAGTTGATT AGCCTTGTTT TTGTTATAAA 11040

ATTCGCCTAA ATAACTGCTT TCTTTTTCCA AGAAGAGCCC TTGGTATTTC ATAGATTTGC 11100

TGGCTTCTAC CACTGCTCCT GCCAGACTTT TTACCGCCAT CAGAGATTGG ACAGGTTCAG 11160

CCATTTCCAA GTACATGGCG CCGAAAAAGT TCTGCTCCCC TTCTCTTTTT AGGGCAGCAA 11220

GATAGGTTGG TAACTGAGAA TTGAGCCCAT TAAAGAAATG AGGAAACTGG AACTGAGTCA 11280

GACTGGATTT GTAGTCTACT ACTCCTATCG CTCCATTAGC TTTCAAACGG TCAATCCGGT 11340

CCACCTTGCC TCGTACAAAG ACACTGCGTC CATTGTCTAA TTGAATAAAG GCTTGGTCTT 11400

TTCCACCAAA ATTTGCTTCT TCTTTGATGG TTTCGATGGC TGGATTGTGT CGGAGAATAT 11460

GTCCAGTTGT CCGTGCAACA TCAAGCAAAA CTTCCTTGGT AAACTGGGCT TCCAAACTTT 11520

CTTGATAAAT AGCTTCAAAT TCGCGTTCTT GACTGGTTTC TTGAATAGCT TGTTCTAGAC 11580

GTTGGTCAAA GGAATCTTCA TTAGGCAACT GTAAGGCGCG TTCAAAGATA CGATGCAAGA 11640

AATTCCCGTG ACTACGGGCA TCAGGATGCA AACGTAATTC CTCCTGCAAG CCTAAAACGT 11700

AGCGTAGGAA ATAACTGTAT TCATTGCGAT AAAACTCTGT CAAACCCGAC GTAGACAGGT 11760

AAAACTCCTG TTTGGCAGGA TAGAGAGCTT GCAAGCTGTC CTTGGCTAAG GTCTTGCTGC 11820

TTGGACTGGT TGGGATAGCT GGATTTTCCA GACCTTGCTG ATCTAGTTTT TTACCTATGA 11880

CACGCGACAG AACCTTGACA AAAGTCAAAT CTTGCTCAGT ATCGCTCATC TCACCCTGCT 11940

GGTGATAGGC AACCAGACTA GACAAAAGAC TGTGATAGGA CCCCATATCC TCCTTAGACA 12000

GTCCTTTGTG ATTCATCCTC TTCTCTCTCC GCCTAAATCC AAAATGGATC AACTCTTGAA 12060

GATAGGCAGA TTCCTTACTT TCACTTTCGT TAAAAAGGCT TGGAGCCGAC AAGAACAACT 12120

GCTTACGAGC AGAATTGACC AAGGAAAGCA TAGTGTAGCG ATTTTTCTTG AGATTTTCAC 12180

TGCTGGCAAT CAGTAATTGA ACGCCTTCTT CGGTCGCTTG GTTTAGGTTT TGCCTTTCTT 12240

CATCTGTCAG AAGACTGGTG TTTTGAGAAA TTTTTGGTAA ATTGTCCTGA GTTAGTCCAA 12300

TAGCATAGAC AAAGTCAGCA GTCAATGGTG CAATCAAATC GTAACTCTGC ACCAGAACAG 12360

TGTCCACTGT TGCTGGAATG GTACGGTATT GGGACAAACT CATTCCAGAA TGGAGCAAGG 12420

CTAGGAAGTC TTCCAGACTA ACCTGTGAAC CAGCAAAAAC AGTCGCAAAT TGTTCTAAAA 12480

CATGGCAGAA AGCCTTCCAA ACTTCGGCTT GTCTTTCCTG TTCTACAGCT TCCAAAGTGG 12540

TTGTCAAATC TTGTAACTGC TTGGTCACAG CTCCTTCTTT TAGAAAGACA CTCCATTTTT 12600

GTAGGAGTTT TTCAGCCTTT TGTTTTCGGC TGGCAAAGAG GGTTTCAAGA GGTGCTAAAA 12660

TTCTCAGGCG GAGGACATTC AAACGCTCAA GATTAAATTT TCCATGGTGG GATTTGGTGA 12720

AGGTTTGCTG AAAGGCTGGC AAGCCATTGA TACCAAGATA GCGGATATAT TGCTCAAAAG 12780

CATCAATATC AGACTGACTG AGGTCAGTAT ACAAATCAGT TCTAAGAAGA TTAATCAAAT 12840


CCTCCTGACG AAAACGGTAA CGTTTTAAAG CTAAAATAGA CTCGACAAAC TGAGTCAAGG 12900

GATGATGAGC CATGGCTTCG CTTCTACCAA GATAAAAAGG AATCTGATAC TGGTCAAAAA 12960

TGGTTTTGAG AGATAACTGG TAAGAAGCTA CATCCCCCAA GAGAATACGA AAATGCTTGT 13020

AGCTCAGGTC TGAGTTCTCA TGTAATTTCT GACGAATACT ACGGGCTACT AGGTCCAACT 13080

CCTCCTTTTG CGTCAAACAA GACCAGATTT GTAAATTTTC ACGGTCTTTC TCATCGACAT 13140

CCAAAGCGAG TTCTGAAAAG TCATAAGAAG ACTCCAACAA ACGAGAGGCC TTGTCAAAAC 13200

TATCCATCTT CTCATGAGTT TGAGAACAGT CCTGAGCAGG CGTTTGGTAT TTAGAAGCCA 13260

GATGATGGAG AAATTTTACG CTGGCTTGGT AGAGATTGCC CTCGCTAAAA GGACTGGTAT 13320

AGGCTTTCTT ACTAGCATAA GCCCCGATAA CAATCTCAAC ACCTTTGCCG TGAAGTAAGT 13380

CCACAACCCG CTCTTCCTCA GCAGAAAAAC GAGTAAAGCC GTCAATGACC AAGGCGATTT 13440

GATTAAAATC ACTACTTACC TTGTCATTCT CAATAGCCTC AATCAAATGG GACAACTGAC 13500

TTTCCTGGGC TAACTGACCT TGATTAAGAT AGGCTGTTAC TTTCTCAAAA ATCAAGAGTA 13560

AATCCGCCCT CTTATCCTCA TCTGTTAAAT TCTCCAAGTC CAAAAAACTC ATCTGAGATT 13620

TGGTCATCTC ATGGTAAAGC TCAATTAACT GCTGGATCAA TTGAGGATCC TGCTTAATAG 13680

CGCCATAAAC ACGCAAGTCC TTGGGATCGA GTTCGGCAAG GCATTTGTAA AAGGCCAACC 13740

CAAGACCGAT ATCATCAAGA GTAGTTTTAG CTGGTAAATC ATTCAAGACC AGATAGCGAG 13800

CCATTTGAGC AAAGCGCGTG ACGGTAATCG AAAAAGAAGC CTGCTGGGAC AAGTATTCCA 13860

GCACGGCGCG TTCCTTTTCA AAAGAAAGAG AGTTGGGGGC AATGTAGAAG ACCCGCTTGC 13920

CAGCTGCAAC TAGCTCTTCT GCCTCTCTTG TTAGAATTTC TGTCAAAGAA GTCCGAATAT 13980

CAGTATAAAG TAATTTCATC TCAGCCTCGT TGGAATTTTT CATCACCCTA TATTATACCA 14040

TGATTAGCCT CGTAAATCTG TTAAAATATT TAGGCCATCC TTTCTTTTCT TCATCATCTG 14100

CTAAATCTTA AATACTTAGC TTTACTTGTA TTAGATAGAA TAAGTCTGGC TACTGAAAAT 14160

CACATAATAA AAAAGCCTCG GTAACAAGGC TTTGAGTTTT ATGATTGTTT CTTAGGTACG 14220

GAATACACTT CAATGTGTTG TCCCAGTATC TTAATGTCGA CTGGTAGATT GTCTGATTTA 14280

TCGCCATCAA CATCGGACTC TAATTCGATA TCAGAAGAAG TTTTAATATT ACGTGCCTTT 14340

ATATATTCAA TATTCTTGAT AGAATGATTG AACTATAGTA AATTGAAACT ATAATAGTAC 14400

ACCGTGGATG CTAAAATATT TCTAGAAATT AATTTGATTT CCCTAATCAA GCTATTCGTA 14460

TCTTATTTCA ATCTACTATA ATAAAATGAA CCAAAAATAG TACACAATGT GGTATAATCT 14520

TCTTATGGCA TATTCAATAG ATTTTCGTAA AAAAGTTCTC TCTTATTGTG AGCGAACAGG 14580

TAGTATAACA GAAGCATCAC ACGTTTTCCA AATCTCACGT AATACCATTT ATGGCTGGTT 14640

AAAGCTAAAA GAGAAAACAG GAGAGCTAAA CCACCAAGTA AAAGGAACAA AACCAAGAAA 14700 

AGTTGATAGA GATAGACTTA AAAACTATCT TACTGACAAT CCAGATGCTT ATTTGACTGA 14760 

AATAGCTTCT GACTTTGGCT GTCATCCAAC TACCATCCAC TATGCGCTCA AAGCTATGGG 14820 

CTACACTCGA AAAAAAGAAC CACACCTACT ATGAACAAGA CCCAGAAAAA GTAGCCTTAT 14880 

TTCTTAAGAA TTTTAATAGT TTAAAGCACC TAGCACCTGT TTAGATTGAC GAAACAGGAT 14940 

TCGATACTTA TTTTTATCGA GAATATGGTC GCTCATTAAA AGGTCAGTTA ATAAGAGGCA 15000 

AAGTATCTGG AAGAAGATAT CAGAGGATTT CTTTGGTTGC AGGTCTAACA AATGGTGAAT 15060 

TAATCGCTCC AATGACTTAC GAAGAGACGA TGACGAGCGA CTTTTTTGAA GCTTGGTTTC 15120 

AGAAGTTTCT CTTACCAACA TTAACCACAC CATCGGTTAT TATAGTAAAA TGAAATAAGA 15180 

ATAGGGGGGG GGGGGGAGGG GGGGGGAGGG AGA 15213 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6004 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

TTATTACCTG AAACATTAAA TTTAATTGGA CATCCCGTTA TCAATTTTAT AATATCATCA 60

AGATTTTTAT TATCTGATTC AGGAATTTTA TCTGATATAA CAACACCATT TTCAAGATAG 120

TTCATTAAAT TATTTGATTC ACTAACATTA GTGTTTTGAT CTCCATCAAG CCAAAAATAA 180 

TGGTTATCGG AATCTAAATA CGATGAGTTT AAAATATTAT TACAAATTAT TTGATTTGCT 240 

CCACCAGGAA TATATCTCAC TACTAAATTC TGTTTAAGAT TCTCACTACC TGAATGAGTG 300 

ATAACAAACT CTAGAATATA TTTAGCTAGT CTATCTTCAA CATAAATCAT CTTCCTAGAA 360 

TGATACACAT CACCTAATTC AAAAAATGCA TCCTGATAAT CAATATTTTC AATAACATCT 420 

ACCTTTTCTC CGTTTTTCAC TAAAAGTTTC ACGGCTTCTC TAGGAAAATC TTTTATAAGT 480 

TGTGTAGAAT GTGTAGTGAT AATAATTTGA TGTTTTTTAT TTAAACACTC TTGAAGTAAA 540

AACTCTTTAA ATTTATAGAT TGCACTCGGA TGAAGTGAGA TTTCAGGTTC ATCTATTAAT 600 

ATTAATGAAT TTGATTGCGC ATTTACTATA TCATTTACTA ACAAAATAAT TCTAGCCTCA 660

CCTGTTCCTG CAAAAGCCTC GGAATATTCT TTTCCAGATT TTTTCATCCA AATAGTTTTG 720

GAAGCTTTTA TATCATCACC TTTTGAATAC AACTTATGTG TTAAAATTTG AATGTCTGTA 780

TAAGATTCAT CCATTATTTC ACTAATAATT TCACAAACTT TATCATCAAC TTTAACATTA 840

TCTATAACCA TTTCCTTTTT ATAACGCGTA TAGCTACTTG TATTATTCTT TAAAATATCA 900

GCAACTGGCT TAGATCGTAA TCTTATAAAA TCTTGTTTAC TACGTTGAGT AGAAATTTTT 960

TTAAAATTAT AGTGATAGAA AAATAAATCA AAAGCAGAAA CATATTCTTT ACAATCACAA 1020

AAGACAACAT TTTTTTCAAT GCCATCCCAT CTGTCTGTCG AAGAACTTCC AATATATTTA 1080

TTTTTGGGTA ATCTTTCCAT CTCATATTGT TTTTGAGGAG CATATGGTTC CCAATAATCT 1140

AATCCTTTTT TTGTTCCAGA ACGGCCTTTA AGAACTTCTA CATTTCTAGA AGCTTTAATG 1200

TTATAATATG AATAGATTAA ACATTGTTTC CCATCCACTT CATCTATTTG ATCAACATTT 1260

GTACTAAACC AATATTCAGA CACACTTTTA TTGGCTGGAG AACCATATAA AGCTTGTAAA 1320

ATTGAAGTTT TATTTACTCC ATATCTATTA CAGACACCTC AGGATTATTT AACTTATAAG 1380

TTTTAACAGC TACGGAATCA ATTTCAACAG CAACTTGAAC ATCTATGCCT GATTTTTTAA 1440

GGCCACTTGT AGTGCCACCT GCACCGTTAA ATAAATCAAT AGCAACAATT TTCCCCATAG 1500

TATTCTCCTA AAGTTTCTCC TTTTTATTAT AACATTATCA AATGTAAAAC CCAACCCGAT 1560

AGGGTTAGGT TTTTAACATC ATTTCACCAA CTTCTTCATC TCATCAATAC GTGCGACGGT 1620

CGCGTCATAT TTAGCTTGGT AGTCAGCTTG TTTGTCGCAT TCTTTTTGGA CGACTTCTGG 1680

TTTGGCGTTG GCTACGAAGC GTTCGTTAGA GAGTTTCTTA CCAACCATGT CCAGTTCTTT 1740

TTGCCATTTA GCAAGTTCCT TGTCGAGACG GGCCAGTTCT TCTTCAACAT TGAGGAGATC 1800

GGCCAGTGGC AGGTAGATTT CTGCTCCTGT GATGACACTT GACATAGCCA GTTCAGGTGC 1860

AGGGATGGTT GATGCGATTT CCAAGTGTTC TGGATTTGTA AAGCGTTTGA TATAGTTGAC 1920

ATTGCTGTTA AAGAAGGCTT CCAAGTCGCT ATCGCTTGTC TTAACAAGGA TGGTGATAGG 1980

CTTGCTTGGT GCTACATTTA CTTCCGCACG CGCATTCCGA ACAGCACGAA TCAAGTCTTT 2040

GAGACTTTCC ACACCAGTGT GAGCCGCAAG GTCTTCAAAG GCTAGATTAA CAGTTGGGTA 2100

TGCAGCTGTC ACGATAGAAC CTTCTGAGAT TTGTCCAAAG ATTTCCTCTG TCACGAATGG 2160

CATGATTGGG TGAAGGAGAC GAAGGATCTT GTCCAGCGTA TAGAGGAGAA CAGATCGAGT 2220

AATGACCTTA TCGTCTTCAT TGTCGCTGTA TAGAACTTCC TTGGTCAACT CAACATACCA 2280

GTTGGCAAAT TCTTCCCAGA TGAAGTTGTA AAGGATATGA CCAGCCACAC CAAACTCGAA 2340

CTTATCAAAG TTTTCAGTAA CTTTTGCAAT GGTTTCGTTG AGATTGTGGA GAATCCAGCG 2400

GTCCGTCACA TTACCAGCCT CACCTGTTGC AACTTTTGTG ACATTGTCAT GCGCCACATC 2460

CAGCGTCAAA CCTTCATTGT TCATGAGGAT ATAGCGAGAA ATGTTCCAAA TTTTGTTAAT 2520

AAAGTTCCAT GAAGCATCCA TTTTCTCGTA AGAGAAACGA ACGTCTTGAC CTGGTGCGGA 2580


ACCGTTTGAA AGGAACCAAC GAAGGGCATC AGCACCGTAT TTCTCGATGA CATCCATTGG 2640

GTCAATCCCG TTACCGAGAG ATTTAGACAT CTTGCGTCCT TGCTCGTCAC GGATGAGACC 2700

GTGGATAAGC ACGTTTTGGA ATGGCTGACG ACCAGTAAAT TCCAAGGACT GGAAGATCAT 2760

ACGAGACACC CAGAAGAAGA TGATGTCGTA ACCTGTTACC AAGGTTGAAG TTGGGAAATA 2820

ACGTTTAAAG TCTTCTGAGT CGACTTCAGG CCAGCCCATG GTTGAAAATG GCCAGAGGGC 2880

AGAACTGAAC CAAGTATCCA AGACGTCTTC GTCCTGAGTC CATCCGTCAC CTTCTGGAGC 2940

TTCTTCGCCG ACATACATTT CACCATCAGC ATTGTACCAG GCAGGGATTT GGTGACCCCA 3000

CCAAAGCTGA CGAGAGATAA CCCAGTCGTG GACATTTTCC ATCCATTGAA GGAAGGTATC 3060

GTTGAAACGA GGTGGGTAGA ATTCGACCTT GTCCTCTGTG TCTTGGTTAG CAATGGCGTT 3120

CTTAGCCAAT TGGTCCATCT TGACGAACCA TTGAGTAGAC AAGCGTGGCT CAACTACGAC 3180

ACCTGTACGT TCTGAGTGAC CAACACTGTG GACACGTTTT TCGATTTTGA CAAGGGCACC 3240

GATTTCTTCC AACTTAGCAA CGACTGCCTT ACGAGCTTCA AAACGATCCA TGCCTGAAAA 3300

TTCAAAGGCA AGCTCATTCA TAGTTCCGTC GTCGTTCATG ACGTTGACTT GTGGCAAGTT 3360

ATGACGTTGG CCAACCAAGA AGTCATTTGG ATCGTGGGCA GGTGTGATTT TCACGACACC 3420

AGTACCAAGC TCAGGATCTG CGTGCTCATC TCCAACGATT GGGATGAGTT TATTAGCGAT 3480

TGGAAGGATG ACGTTTTTAC CAATCAAGTC CTTGTAGCGC GGGTCTTCTG GATTAACCGC 3540

AACCGCAACG TCCCCAAACA TAGTCTCAGG ACGAGTTGTA GCAACTTCAA GGGCGCGTGA 3600

ACCATCTTCC AGCATGTAAT TCATGTGGTA GAAGGCACCT TCTACATCCT TGTGAATCAC 3660

CTCAATATCA GAAAGGGCTG TGCGAGCTGC TGGGTCCCAG TTGATGATAA ACTCACCACG 3720

ATAGATCCAG CCTTTCTTGT AAAGGTTCAC AAAGACCTTA CGAACAGCTT TTGACAAACC 3780

TTCATCAAGA GTGAAACGCT CACGAGAATA GTCTACAGAA AGCCCCATCT TGCCCCATTG 3840

TTCCTTGATG GTAGTGGCAT ATTCGTCTTT CCATTCCCAG ACCTTCGTCA AGAAAGACTC 3900

ACGACCTAGG TCATAACGCG TAATACCCTC ACCACGTAAG CGCTCCTCAA CCTTAGCCTG 3960

AGTCGCAATA CCAGCGTGGT CCATACCTGG AAGCCAAAGG GTATCAAAGC CTTGCATGCG 4020

TTTTTGACGG ATGATGATAT CCTGCAAAGT CGTATCCCAA GCGTGACCAA GGTGAAGTTT 4080

CCCAGTTACG TTTGGTGGTG GAATCACGAT TGAATAAGGC TTAGCCTTTT GATCGCCTGA 4140

AGGCTTGAAA ACATCCGCAT CAAGCCATTT TTGGTAACGA CCAGCCTCAA CCTCGGCTGG 4200

ATTGTATTTA GGTGAAAGTT CTTTAGACAT GTGTGTGTCC TTTCTCTATT TTGTTTATTT 4260

TATTTTGAAT TTGCTTAGCA GCTTCTTCTG CAGACAAATT CGTATTATTT ATTTTAAAGT 4320


AGTGGTGCAA CTCATTCGGT TGATGTTGGG AATTTAATTG AAGTGTTTCA GCGGTCTCTA 4380

AAATTTCTCT TTCAGATACC TCAATATGTC GTTTTAAGGG TTTGTGCTTT AATCGATTCT 4440

CCGTTCGATT TCGACGTATG CACTCTTCAA GACTTGTTTC CAATTCAACA AACAGAATCT 4500

CTTGATGAAA GTTATCCAAT AAATCCTGAA TTTGCTTTAA ATACATCAGC TGGTACTGAT 4560

TTGAAAAATC AATTACGTCT GTTAAAATTA CTGATCGCTG ATTTCTTGCA CTTGCTCCAA 4620

GGAAAGAAAA GGTAATTCCA CGAACAAATT CCCACATCTC CTCGGTATAA TCCTGATAGA 4680

TCTCTAGTGC AAAATCAATG GCTTGATGGT TATAAAATAG GGTAGCATCC GTCAGTCGAG 4740

ATAATTCTTG ACCAATGGTC ATTTTTCCTG ATGCTGGAGC ACCAATGATG AAAAGATGCA 4800

TCAAATCACC TCCCACTCAC TCCTCAGCAA GCCATATCTC AAATCATCAC AGCAGTTGCC 4860

TTGAGCATCT TTGCGGTCTC TTATGCGAGC TTCGAGGGTA AAGCCAAGCT TTTCCGAGAC 4920

TCGTTGACTT TGAAGGTTAT ATCCAAAGCA AGTTAGTTCA ATCTTGTGAA GACCAAGTTC 4980

TTTAAAAGCT AGATCAATCA AGGAACACGC TGCTTCTGGA ACATAACCTC GACCCCAATA 5040

GTCTGGGTGC AAGGTATAGC CAAGCTCTAG CACATCATCC GCATGAAGAT GGTTGAAGTC 5100

AACAGAACCA ATGACTTTAT CGGTTCCTTT GACGACAATC CCATAGCCAG CTGGGAGATT 5160

TTCCTTTTGA GTACGCTCCG GAAGAATGTG CTCCAGATAA TAAATCTCAT CTTCCAAGAT 5220

CTTGACTGGA GGAAAACCTG CTGGATAGGC GACCTCTGGC AAACTAGCGT AGGTATGGAT 5280

ATCCTCAGCA TCCACCACTG TGCGGACTCG TAAAACGAGA CGTTCTGTTT CGATTTTATC 5340

TGGCAGCTCA GTTCTTGCCA TCCTTCTTCC TCGCTTTTTT GATGAAACTG CCCTTCATAT 5400

CTACACGCTT GTCCAGATAG CGATAAACGC GCTGATATCC ATCTCCCATG AAATAGGTTG 5460

GGGCAAACAG TTGATTTTTA AAATGTCCCT TTTCATCCAG GAGTTCTGGG GCAACAAGTC 5520

GCTCAAGAAT CTTGGCAAAG ATGTGGCAAA TACCGTCTTC CTCAACAATC CTATCTACCC 5580

GACAATCTAA AACAAGTGGA CAGGCGTCTA AAATAGGAGT CTGAGTTCGT TCAGAAATrT 5640

CATAATGCAC TCCCAAACGT TCCAATTTCT CCTGATGACT GATAAAACCA GCCTGCTCCA 5700

TCGCAAGCAT AGAAGTTTCA TCAGAAATAT TCACAGTAAA TTTTTGATAC TGTTTGATCT 5760

GCTCTGCGGC ATTCTCTCTC GCAACGACTC CAATCACAAC CCAATCTCCT AGACTATAAG 5820

AGGAACTACA GGTCGTGATG TTATAGCCAA AATTCTAATC TTGATATCCT AAAATAAAAA 5880

CAGGAAAACC ATAATATAGT TTACTTGTGT TAAAAGATTG CTTCATAACA ACCCCCTTTG 5940

ACTAAGACGT AAAAGAAAAG CCCTGCCATC TACATGACAG GGACGAATGT GTTTATCCGC 6000


GGGG 6004
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5857 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:






1 \j 1 AtxftATTC 


ACGACAATGC 


TTCGTTGATT 


TCTGGGTTGA TTTCGTCGCG TTCTGGCAAG 


60 


CvjAoTCAATG 


AACCAAAAAT AGTACACAAT GTGGTATAAT CCTTTTATGG 


CATATTCAAT 


120 


ACjAi l i i CGT 


AAAAAAGTTC 




1 LxMlaUurAAIJA litj I Ala IATAA 


CAGAAGCATC 


180 


ACACGTTTTC 


CAAATCTCAC 


*J J. .nn. ± -/"iV—* — r\ 1 


1 1Vj*jf i'TAAAVjCTAA 


AAGAGAAAAC 


240 




AACCACCAAG 


TAAAAGGAAC 


<uwwiv>V.AnuA iuuUj JL 1 \jA 1 A 


GAGATAGACT 


300 


TAAAAACTAT 


CTTACTGACA 


ATCCAGATGC 


****** x \3t\t\r\ I nuv. 1 X 


CTGACTTTGG 


360 


CTGTCATCCA 


ACTACCATCC 


ACTATGCGCT 


CAAAGCTATG GGCTACACTC 


uAAAAAAOAA 


420 


CCACACCTAC TATGAACAAG ACCCAGAAAA AGTAGCCTTA TTTCTTAAGA ATTTTAATAG 480

TTTAAAGCAC CTAACACCTG TTTAGATTGA CGAAACAGGA TTCGATACTT ATTTTTATCG 540

AGAATATGGT CGCTCATTAA AAGGTCAGTT AATAAGAGGC AAAGTATCTG GAAGAAGATA 600

TCAGAGGATT TCTTTGGTTG CAGGTCTAAC AAATGGTGAG TTAATCGCTC CAATGACTTA 660

CGAAGAGACG ATGACGAGCG ACTTTTTTGA AGCTTGGTTT CAGAAGTTTC TCTTACCAAC 720

ATTAACCACA CCATCGGTTA TTATTATGGA TAATGCAAGA TTCCATAGAA TGGGGAAGCT 780

AGAACTCTTG TGTGAAGAGT TTGGGTATAA ACTTTTACCT CTTCCTCCCT ACTCACCTGA 840

GTACAATCCT ATTGAGAAAA CATGGGCTCA TATCAAAAAG CACCTCAAAA AGGTATTACC 900

AAGTTGCAAT ACCTTTTATG AGGCTTTTTT GTCTTGTTCT TGTTTCAATT GACTATATAA 960

ATTGTCTAAG CGAAACAACC GATAAGAATT GGCACAAAAG CGACCGTATT TTTGTTACCA 1020

ATACAGGAAA AACAGTTCAT AGTTCTATCT TGAGCAAGTC TCTCCAGCGA GCAAACGAAC 1080

GCCTTAAAAA ACCAATTCCC AAACATCTGT CCCCTCACAT CTTCAGACAC ACCACTATTA 1140

GCATCTTATC AGAAAATAAA ATTCCTTTAA AAACAATCAC GGACAGGGTT GGTCATCCCG 1200

ACTCTGAAGT CACTACTTCC ATCTACACCC ACGTCACAAA GAACATGAAA GATGAAGCAA 1260

TCAATGTACT GGATAAAGTT ATGAAAAAGA TTTTTTAAAA AGTTTTGTCC CTTTTTTGCC 1320

CTCTAAATAC AAAAATAGCC CTTCGGATAA AATCCGAGGG GCTAGAAACG TTGTTAAATC 1380

AACGGCCGAA CTTTTGAATT TCATGGTTCG GGATAAAATA GTTCACTGAA CTATTTTATT 1440

TTTTAAGGTT ATCATAATAT CAAATAGTTC AATTAAATAC GCTAAATTAC TAATATACTT 1500 

TTTACCTTTT TCATTCTAAA ATGTAAAGTA CAAACAATTA CAATATACTA GAGGGGGAGT 1560 

AAAAAAGGTA TTAAATCGAT GAGTTCAGCA GGCAAGAAAA TAGCACCTTT ACGGGTGCTA 1620 

TTTTTTAATT AACGCCACGT TAACTTTTGA TTGATGAATT TTATTGTTTG GCACTTCTTT 1680 

CATTTCACGG TAAACATCGA TGAAATTCTT TCCAACATTA TTTTTGGAGT TAACTGCATT 1740 

TATTTTTGTA TTAATAACTT TTTTAGTATC GAAAGAATGG TTTAAGAAAT CCATAACTAA 1800 

CTCTCCTTTC TCATCCTGTA ATCAAGATTT TTATCAATGT CAAAATAGTA TTTTCTATCA 1860

ATCCAAATTG GTCCTTCTCC TTTAGAAATA GCAAGTACAT CTACCGGACC TCCTACTGTT 1920 

TCAAGAGTGT TGACAATTTT TCTCTTAAAT GAAGTTAATT CAATAAATGT TTTAGCTGTA 1980 

CTCGCCATTT CATTAAGTGG TTGCATTCCA ATAAGGTCTA TTATAGGATT TATATAATAT 2040 

TTTTGCTGTA TAGATGATAT ATTTTCAAAT ATATTCTCAA TTTCATCACC CAATCCATTT 2100 

TTCTCCATAA CTGATGATAC TTGCTCTGCG ATATATACAT TTAAGTTAGG ATCTATACCA 2160 

TTCATAATCG TCTCAACCAT CTCTGACTGT GCAAAAGGGA TTATATGACA AGTTTTATGA 2220 

TGATTTATCA CACTTTCATT AATAACTTTC CAAATTAATC GTTTAGAAAA AATTCCATAT 2280 

AATTCAATTT GTCTTATAGA TGGAAATATC TCGTCTGTAC CATAACCTGC TATAACTAAT 2340 

CCAGTTATGT TTGTTGAGTC ATATCCAATG AAAATCGCTT TATATAAAGA TTTAGCAATA 2400 

ACTTCAACCT CATCATCAGT ATGAGGAAAG GATTTAAAAA CATCGTCTAC AATGCTTTTT 2460 

ATTAACTCTA ACTCAGCTTC AAAAAATTCA AAATTACTTT CAGCTTCTAC TTTTGAAATT 2520 

TCTAAACTAA AATTAGTTAT AGCATTTAAT AAAATTTTAT TAAAATCATC TAGAGTGATG 2580 

GTTTCACCAT TAGAAACTCT TAAATCAGCT GTTTCTTGCG CTTCATAGGC AATGCTGTCC 2640 

AAAATACTTC TTGTACTTCT GACAATATAA TTTCTTAATA AATCCTCAAC TTGTAGATGT 2700 

TTAAAGGAAA TTAAAAATTC TATTAGCTTT TCAACGTATT GGGCAGTATT ATCTAATAAA 2760 

TCTGTGCCAA TAGCCTGCTT AAACTCATTT AAAATTACCT CCCACGGAAT TTCCATAAAC 2820 

GAAGCGTTCC CATATATCAT GATCCCCACG GAATGTTCTT TTGATAAAGT GAATAATTTT 2880 

CGGGCGCTAT TAAAAACTTT TGAATTTTTC CCGTCTGATA AGGTTACAGC GCTATCAGAA 2940 

GCCAATACAA CACCATTTTT ATTTAATATT CCAATTTCTG CTGTCAAAAT ATCACCTAAA 3000 

CTTTCTAAAC CTGCTCATGC TCTAATGGTA CAACAGCTAA GGTCTTACCA AGACTTGCCA 3060 

ACACTTTTAA TACTGTATCA AGTTGTGGGC TTGTCTTTCC TGTTTCCATT CTAGCGATAA 3120 

CTGGCTGACT AACACCGCTC ATCTCCTCTA GTTTCTTCTG ACTAATACCC TTTTCATTTC 3180

TAGCCTCGAT AAGCTCACTC ATGATAGCCA CGCGCATATC ACTTTCCAAA ATTTCCTCTT 3240

TGCTGAATAA TTCAGCTCTT ACATCTTTCC AGTTACTACC AATAGCATTA TTTTTCATTG 3300 

TCTAAACCTC TTTCTTTTAA ATCTGCAAGT TCACGTTTAG CTTGCTCAAT CTCTCTTTTG 3360 

GGTGTTTTCT GTGTCCTTTT CATAAAATGA TGCAGTAAAA CAAAACTACC ATCCATCCAA 3420 

GCAACAAATA AAATTCTATC TCTAAGTGGT CTCAGCTCCC AAATTTCAGC ATCTAAATGC 3480 

TTAATATATG GTTCGCCTGC GCGTGTTCCA TGTTGGCTTA ACAACTCAAT ATAATCATTA 3540 

ATTTTATTAA GCTTAATTCT GCTATCTTTC CCTTTTTTAC TGGTAAGCTC TCGCATATAA 3600 

TCAAAAACAG GCTCATTGCC GTTTTTATCC TTGTAAAAAT AGATATTATG CACTATTAAC 3660 

ACCTCTTCCT AATAACAATT ATAACCTAAA AGTTATTGTT TGTAAATACT TTTAAGTTAT 3720 

TAAAATAAAA AGCACCTAGT TTCCTAGATG CTAGCACAAT GACACGGATT CGCACCGTGG 3780 

CTACCTCTAT CAAGGTGTAC TCCTTCTATA CTATCCCTTG TGCTTTAGAA TATTATACCA 3840 

CACAATCAAC TAGATACCTA CCATCTCATG ATATACCCCC ATTTTGGGCA AGGGTACAAC 3900 

GCTAAAATAC AAATCAGAAT AGATATTAAA CCACTTATTT AACTTATCAT AAGCTGGTGA 3960 

TTGACTGATA AATAATATCC GCTGACAAGC TCCGATAACA TTCATGTGAT TGTACACATA 4020 

AACCTCTTTT ACAGCCTCTA AAATGTCAGC CTCACTTGTT TGTACCCTAA TATCTGTTAT 4080 

CTGCTTGATA GTTGCGTATT TTTGATAAGC TAGCATATCT TGATTTTTAG CAGCATCAAA 4140 

CATTTTACGC TCAAGGACAC TATACTTAGG TTGTTCTTTA TCTCGCATGA AATACCACTT 4200 

GAGCCATAAA ATCTTTTCTC GGTGTATTAC AGAAATACGC TCAATTTTCT TCTTTGTCAT 4260 

TGCTACCTCC TAAATCATCA ATTTAACAAT TCTAACCACT CACTTTTAGA AATAGTTGCA 4320 

TAGATCTTGT TCGATGTATG ATACAAAGGT TCTAAATCTT TTTCCACCCT AATATAGTTC 4380 

ATCTTATCCT CATGAGTAGG AAAGTATAGT ATTTCCGTTT CATCCTCGTT TAGGATACGA 4440 

TTGCACCAAT CATCAATAAT AACTGGCACT TCCCACTCAC GCCATTTTTT AAGGTTTTCT 4500 

AAAAGTTCAT TATCACTAAA TAGCTCGCCA TCTATTTGGA AAAATTCCCC TAAGTCATTG 4560 

TTTCCTTCAA CAATAATAAA CTCTGGCATA TTTCTATTAC TTAATAACTC CTTGAGTTCT 4620 

TGTAACTCTT TGATTTCCTT TAGATACTTC CTCAATTTCC AACCTCAATT CTTCAATCTG 4680 

CCTTACTACT CCAAAAATTT CATGGGTCTT ATAAGATTGT TCAAGTATAG CCTTTGCTGC 4740 

TTGAGTTCTT ATAAACGGGT TGACCTTACT GTCCATCATA ATATCATTGA GTACAGAAAC 4800 

AGCGTTAGAT GATGCTAAAT AAAGCATTTG AGTTGTTTTA TCCATCATCT CATCTTGCTT 4860 

TATCCTCAAT GTCTTTTTAA CCGCTGCAAC TTTTAGATAC TTATGACCTG TTGCGCGTGA 4920

TACCCCTGCT TTTTGACATG CTTTGTCTAT CGTTGGCTCG GTAAGCATGG CATCTATGAA 4980

TTTAATTTGC TTGGACGTAA GGTTATCATT TTCATTTCCT GCCATCTATT ACCTCCTCAT 5040 

TATCAAAATA AAGGGTTGCC CCTTTATTTC CCTATGCTAG ATAATTCTGC AATTCTGCAT 5100 

CCATTGCCTC TGAATTGCCC TCAACAATCA TTTCATGCTG TACTAAATCA ATCTTATCTC 5160 

CGTTAATAAG TAAACCACCG TGGAAATAAT CAATTTTTCT ATCAAGGAAA TGTACTAGCT 5220 

TTTCAAGGCG TTGCTGTTGG CTGAATTGCT CCATGTCAAT TTCGATATAA GCAAGGGTAG 5280 

TATCATTATC CATAATATCT TCTAATTTTC TAAGAGCTAG AGGTTTATTT TTATATTTTT 5340 

CTAGGTATTC TCTCATTTCT GCCACTGTTA ATTTGATACT AGATAATAAA CTTAGTTCAG 5400 

CTGCATCATC TGCTGTAATA GGCTCTTCTT TTGATTCATG GTTTGCTAGT TCAGCATTTT 5460 

TCTCTTTTTC TAGTTGCTGA TACAATAGCT GAGCAGTATT TTGGGAATAG TTTTCGCCCT 5520 

CTTTTTTATA TTTTAAAAGT TCTTGCTCTG CATACACTTT CCCGATAATC ACTTCCTTAT 5580 

AAACTAATTG CCCATCTTGA GCTTTTAGCT TAATACTCCC ATGCTCTGGA ATTTCAATAT 5640 

ACTTAATTAT ACCATTTTTT GAGTATAAAA CAAAGCCTTT CTCCATCATT TTTAATAATT 5700 

TATCATCCTT GTTTTCAGTC ATGCTTTTCT CCTTTATTTC ATTTTATTAT AATCTGAATA 5760 

CCCCTAGTCT ATTTATTTCA CTAGGTTTTT AGGGTTCGTA TGCTAAAATA CTACCCTTTT 5820 

TGTGTACCTT ATGGCTGACT TTTCAAATTG GTTAGTT 5857
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10254 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

AAAATGATAG CAGGAGAGTT TTCCCGTCCA TCAGACCCAG AACTGAGAGC CTTAGCTCAG 60 

GCTTCTCGCC AAAAACAGGC CGCCTTTAAC AAGGAAGAGA ACCCCTTGAA GGGAGCCGAA 120 

ATCATCAAGA CTTGGTTTGC CTCAACCGGG AAAAATCTTT ACATCAACAC TCGCTTGATG 180 

GTGGACTACG GTGTCAACAT CCATCTAGGG GAAAATTTTT ATTCTAATTG GAACTTGACC 240 

ATGCTGGATA TCTGTCCCAT TCGTATCGGG GACAATGCTA TGATTGGTCC TAATTGTCAG 300

TTTTTGACAC CCCTCCATCC ACTAGATCCA CAGGAACGCA ATTCAGGTAT CGAGTACGGA 360 

AAGCCTATCA CAATCGGAGA TAATTTCTGG ACTGGTGGTG GCGTCATTGT CCTTCCTGGA 420 

GTGACACTGG GAAATAATGT CGTTGCAGGA GCAGGGGCAG TAATTACCAA ATCTTTTGGC 480

GACAACGTTC


x V.V. 1 HuV, luu 


LAA i \_V_ 1 \jV_o 




AGGAAATACC 


TGTTAAATAG 


540 


AAGTAAAAAft 




uu i J vj i J 1L 1 


X 1 i i lliTAGG 


TTTCATCATT 


TTTTACCCAG 


600 


1 X V.nV.n 1 i. X f\ 




CTCTTAGCAA 


GTCTGTTTCA 


TTAAGCAAGT 


TCAAAGCATC 


660 


X w 1 AnU XUV? 


oAlvl 1 X 1 XV. 


TCCTCAGTTC 


ATCAGCTTCC 


TCCTTGACAC 


TCGGTCAGAT 


720 


i 1 1 v7f\ 1 AlwAA 


1 Alt 1 ALAAAA 


1 J. AvjAdUAub 


CAGGCTATGA 


TTCAGAAACA 


TGCGATTCCT 


780 


rtX X i 1 AuAu i 


x IVxAJ/uAUAA 


TCCTUAGGCG 


GTTATCATGC 


CCAATCACGA 


GGGGCTGGAC 


840 






TQT1 TATuC-A 


TTTTTAGGTG 


AGGAGATTGA 


CCGCTATGCG 


900 


AGGGAAfiTAfi 


uuuLuAAt I vj 


lui luuLbAA 


TTTGTTTCTG 


CCACCAAGAC 


CTATCCAGTT 


960 


X n. 1 V3 X V.VJ iun 


Av. 1 ALAAvajA 


v- bAGGAGGTC 


TGTCTGGCTC 


AGGCTCCTGT 


TGGCTCCGCT 


1020 




Aui rTATGviA 


TTGGTTGATT 


GGCTATGGTG 


TGGAGCAGAT TATCTCTACT 


1080 


uuwiLL lv» I u 


GTGTCCTAGC 


TGAT A r AG AG 


GAAAATGCCT 


TTCTAGTCCC 


TGTTCGCGCT 


1140 


CTGCGAGATG AAGGAGCCAG TTACCACTAT GTGGCACCTT GTCGTTATAT GGAAATGCAG 1200

CCAGAGGCTA TTGCTGCTAT lAjAvaCiAAvaTT TTGGAAGACA GAGGGATTCC TTATGAAGAA 1260

GTCATGACCT GGACGACAGA CGGTTTTTAC CGAGAAACGG CTGAAAAGGT GGCTTATCGT 1320

AAGGAAGAAG GCTGTGCTGT TGTGGAGATG GAGTGTTCTG CTCTTGCGGC AGTAGCTCAA 1380

TTGCGTGGGG TTCTCTGGGG TGAATTGTTG TTCACAGCAG ATTCTCTAGC GGACTTGGAC 1440

CAGTACGACA GTCGTGACTG GGGCTCGGAA GCTTTTAATA AGGCGCTAGA ACTGAGTTTA 1500

GCAAGTGTTC ACCACCTTTA GTTGTACTGG CAAAGGATTT GTTTTATCAT AAAATGTCTA 1560

GCTCATACTT TTCAAAAATA TGTTTAAACG AGGTCACCTT CCTCTTGTCC TAGGCATGTT 1620

GAGGTTGGGA AAAATCTTTA AAATCAGAAA AACGTATCAT ATCAGGTGAT GAAAACTTTG 1680

ACACTATGCG TTTTATGTCG ATAAGATTTA GAGTGAGATG AAATGATACT CTTCGAAAAT 1740

CTCTTCAAAC CAGGTCAGCT TCACCTTGCC GTAGGTATAT GTTACTGACT TCGTCAGTCT 1800

TATCCGGCAA CCTCAAAACG GTGTTTTGAG CTGACTTCGT CAGTTCTATT TGCAACCTCA 1860

AAACAGTGTT TTGAGCAACC TGTGACTAGC TTTCTAATCG ATGCCTTGGT TTTCATTGCC 1920

TATAATCAAA AAGAGAAATT TTCTCCTGAA AAGCATATAG AGTAGCTGGC GTTAAAAGCT 1980

CCTGTCTTGC TTTTTTGACC TATAGTCACA TCTATCAAGT ATTGTTCTTG CCTAAGCTAT 2040

CAATAAAAAG GTGGCATTTT TTAGGCTTGG TGTTAGTAGA TTTTGCCTTA TCCTATCTAA 2100

GTCATTTCGA ACTTTTTATG GTACAATGGA AACATGTTAT TCAAATTATC TAAGGAAAAA 2160

ATAGAGCTAG GCTTATCTCG TTTATCGCCA GCCCGTCGTA TTTTTTTGAG TTTTGCCTTG 2220

GTCATTTTAC TAGGCTCTCT TCTTTTGAGC TTGCCCTTTG TCCAAGTTGA AAGCTCACGA 2280


GCGACTTATT TTGATCATCT TTTCACTGCT GTCTCTGCAG TCTGTGTGAC GGGTCTCTCA 2340

ACCCTTCCAG TAGCTCACAC CTATAATATC TGGGGTCAAA TAATCTGTTT GCTCTTGATT 2400

CAGATCGGTG GTCTAGGGCT CATGACCTTT ATTGGGGTTT TCTATATCCA GAGCAAGCAA 2460

AAGCTTAGTC TTCGTAGCCG TGCAACTATT CAGGATAGTT TTAGTTATGG AGAAACTCGA 2520

TCTTTGAGAA AGTTTGTCTA TTCTATTTTT CTCACGACCT TTTTGGTTGA GAGCTTGGGA 2580

GCTATTTTGC TTAGTTTTCG CCTTATTCCT CAACTTGGCT GGGGACGTGG TCTTTTTAGT 2640

TCCATTTTTC TAGCGATCTC AGCCTTCTGT AATGCCGGTT TTGATAATTT AGGGAGCACC 2700

AGTTTATTTG CTTTTCAGAC CGATTTACTG GTCAATCTGG TGATTGCAGG CTTGATTATT 2760

ACAGGCGGCC TTGGTTTTAT GGTCTGGTTT GATTTGGCTG GTCATGTAGG AAGAAAGAAA 2820

AAAGGACGTC TGCACTTTCA TACGAAGCTT GTACTATTAT TGACTATAGG TTTGTTGTTA 2880

TTTGGAACAG CAACTACTCT CTTTCTTGAG TGGAACAATG CTGGAACGAT TGGCAATCTC 2940

CCTGTTGCCG ATAAGGTTTT AGTTAGCTTT TTTCAAACAG TGACGATGCG AACAGCTGGC 3000

TTTTCTACGA TAGATTATAC TCAGGCTCAT CCTGTGACTC TTTTGATTTA TATCTTACAG 3060

ATGTTTCTAG GTGGGGCACC TGGAGGAACA GCTGGGGGAC TCAAGATTAC GACATTTTTT 3120

GTCCTCTTGG TCTTTGCACG AAGTGAGCTT CTAGGCTTGC CTCATGCCAA TGTTGCGAGA 3180

CGAACGATCG CGCCGCGAAC GGTTCAAAAA TCCTTTAGTG TCTTTATTAT CTTTTTGATG 3240

AGCTTCTTGA TAGGATTGAT TCTGCTAGGG ATAACAGCCA AAGGCAATCC TCCCTTTATC 3300

CACCTCGTAT TTGAAACCAT TTCAGCTCTT AGTACAGTTG GTGTAACGGC AAATCTGACT 3360

CCTGACCTTG GGAAATTGGC TCTCAGTGTT ATCATGCCAC TTATGTTTAT GGGACGAATT 3420

GGTCCCTTGA CCTTGTTTGT TAGCTTGGCA GATTACCATC CAGAAAAGAA AGATATGATT 3480

CACTATATGA AAGCAGATAT TAGTATTGGT TAAGAAAGGA AAGAGCATGT CAGATCGTAC 3540

GATTGGAATT TTGGGCTTGG GAATTTTTGG GAGCAGTGTC CTAGCTGCCC TAGCCAAGCA 3600

GGATATGAAT ATTATCGCTA TTGATGACCA CGCAGAGCGC ATCAATCAGT TTGAGCCAGT 3660

TTTGGCGCGT GGAGTGATTG GTGACATCAC AGATGAAGAA TTATTGAGAT CAGCAGGGAT 3720

TGATACCTGC GATACCGTTG TAGTCGCGAC AGGTGAAAAT CTGGAGTCGA GTGTGCTTGC 3780

GGTTATGCAC TGTAAGAGTT TGGGGGTACC GACTGTTATT GCTAAGGTCA AAAGTCAGAC 3840

CGCTAAGAAA GTGCTAGAAA AGATTGGAGC TGACTCGGTT ATCTCGCCAG AGTATGAAAT 3900

GGGGCAGTCT CTAGCACAGA CCATTCTTTT CCATAATAGT GTTGATGTCT TTCAGTTGGA 3960

TAAAAATGTG TCTATCGTGG AGATGAAAAT TCCTCAGTCT TGGGCAGGTC AAAGTCTGAG 4020

TAAATTAGAC CTCCGTGGCA AATACAATCT GAATATTTTG GGTTTCCGAG AGCAGGAAAA 4080

TTCCCCATTG GATGTTGAAT TTGGACCAGA TGACCTCTTG AAAGCAGATA CCTATATTTT 4140 

GGCAGTCATC AACAACCAGT ATTTGGATAC CCTAGTAGCA TTGAATTCGT AAAGAGGGAT 4200 

GACCCCTCTT TTTTGATGCC TAAGATGGCA AATAGAGACA GAAGCCCCTT GTCTTCTAGT 4260 

AAAAGTTCTT CAAAGGCTGG ACTTTATGGT AAAATAGAAA GAAGTGACAA GAGAGAGTAA 4320 

TACTCAATGA AAATCAAAGA TCAAACTAGG AAACTAGCTA CGGGCTGCTC AAAACACTGT 4380 

TTTGAGGTTG CAGATAGAAC TGACGAAGTC AGTAACATCT ATACGGCAAG GCGACGTTGA 4440 

CGCGGTTTGA AGAGATTTTC GAAGAGTATA AGAAAAAATC AGTCCCCTAA AGGAGTAGAT 4500 

TATGAAGTTA TTGTCTATCG CAATTTCTAG CTATAATGCA GCAGCCTATC TTCATTACTG 4560 

TGTGGAGTCG CTAGTGATTG GTGGTGAGCA AGTTGGGATT TTGATTATCA ATGACGGGTC 4620 

TCAGGATCAG ACTCAGGAAA TCGCTGAGTG TTTAGCTAGC AAGTATCCTA ATATCGTTAG 4680 

AGCCATCTAT CAGGAAAATA AATGCCATGG CGGTGCGGTC AATCGTGGCT TGGTAGAGGC 4740 

TTCTGGGCGC TATTTTAAAG TAGTTGACAG TGATGACTGG GTGGATCCTC GTGCCTACTT 4800 

GAAAATTCTT GAAACCTTGC AGGAACTTGA GAGCAAAGGT CAAGAGGTGG ATGTCTTTGT 4860 

GACCAATTTT GTCTATGAAA AGGAAGGGCA GTCTCGTAAG AAGAGTATGA GTTACGATTC 4920 

AGTCTTGCCT GTTCGGCAGA TTTTTGGCTG GGACCAGGTC GGAAATTTCT CCAAAGGCCA 4980 

GTATACCATG ATGCACTCGC TGATTTATCG GACAGATTTG TTGCGTGCTA GCCAGTTCTA 5040 

ACTGCCTGAA CATACTTTTT ATGTCGATAA TCTCTTTGTC TTTACGCCCC TTCAGCAGGT 5100 

CAAGACCATG TACTATCTGC CTGTCGATTT CTATCGTTAT TTGATTGGGC GTGAGGACCA 5160 

GTCTGTCAAT GAGCAAGTGA TGATTAAGTG CATTGACCAG CAACTCAAGG TCAATCGACT 5220 

CTTGATAGAC CAACTTGATT TGTCCCAAGT GAGTCATCCC AAAATGCGAG AATATCTGCT 5280 

GAATCATATT GAACTCACGA CGGTGATTTC CAGTACCCTG CTCAACCGAT CTGGAACAGC 5340 

GGAGCATCTG GCAAAAAAAC GCCAATTGTG GACCTATATT CAGCAGAAAA ATCCAGAAGT 5400 

CTTTCAGGCT ATTCGTAAGA CCATGTTGAG CCGTTTGACC AAACATTCTG TCTTGCCAGA 5460 

TCGCAAACTG TCCAATGTCG TCTATCAAAT CACCAAATCT GTTTATGGAT TTAATTAATA 5520 

TAAGTGTTTT ATAAGAGGGA TTTAAGAAAA ATTTTAACTT TTTCTTAGTC CTTTTTAATT 5580 

TCAGGAGATT ATACTAGAGT CATCAAATAA AGAAAGACTC TAAGGAGAAT CCTATGAAAT 5640 

TCAATCCAAA TCAAAGATAT ACTCGTTGGT CTATTCGCCG TCTCAGTGTC GGTGTTGCCT 5700 

CAGTTGTTGT GGCTAGTGGC TTCTTTGTCC TAGTTGGTCA GCCAAGTTCT GTACGTGCCG 5760

ATGGGCTCAA TCCAACCCCA GGTCAAGTCT TACCTGAAGA GACATCGGGA ACGAAAGAGG 5820

GTGACTTATC AGAAAAACCA GGAGACACCG TTCTCACTCA AGCGAAACCT GAGGGCGTTA 5880

CTGGAAATAC GAATTCACTT CCGACACCTA CAGAAAGAAC TGAAGTGAGC GAGGAAACAA 5940

GCCCTTCTAG TCTGGATACA CTTTTTGAAA AAGATGAAGA AGCTCAAAAA AATCCAGAGC 6000

TAACAGATGT CTTAAAAGAA ACTGTAGATA CAGCTGATGT GGATGGGACA CAAGCAAGTC 6060

CAGCAGAAAC TACTCCTGAA CAAGTAAAAG GTGGAGTGAA AGAAAATACA AAAGACAGCA 6120

TCGATGTTCC TGCTGCTTAT CTTGAAAAAG CTGAAGGGAA AGGTCCTTTC ACTGCCGGTG 6180

TAAACCAAGT AATTCCTTAT GAACTATTCG CTGGTGATGG TATGTTAACT CGTCTATTAC 6240

TAAAAGCTTC GGATAATGCT CCTTGGTCTG ACAATGGTAC TGCTAAAAAT CCTGCTTTAC 6300

CTCCTCTTGA AGGATTAACA AAAGGGAAAT ACTTCTATGA AGTAGACTTA AATGGCAATA 6360

CTGTTGGTAA ACAAGGTCAA GCTTTAATTG ATCAACTTCG CGCTAATGGT ACTCAAACTT 6420

ATAAAGCTAC TGTTAAAGTT TACGGAAATA AAGACGGTAA AGCTGACTTG ACTAATCTAG 6480

TTGCTACTAA AAATGTAGAC ATCAACATCA ATGGATTAGT TGCTAAAGAA ACAGTTCAAA 6540

AAGCCGTTGC AGACAACGTT AAAGACAGTA TCGATGTTCC AGCAGCCTAC CTAGAAAAAG 6600

CCAAGGGTGA AGGTCCATTC ACAGCAGGTG TCAACCATGT GATTCCATAC GAACTCTTCG 6660

CAGGTGATGG CATGTTGACT CGTCTCTTGC TCAAGGCATC TGACAAGGCA CCATGGTCAG 6720

ATAACGGCGA CGCTAAAAAC ccagccctat CTCCACTAGG CGAAAACGTG AAGACCAAAG 6780

GTCAATACTT CTATCAAGTA GCCTTGGACG GAAATGTAGC TGGCAAAGAA AAACAAGCGC 6840

TCATTGACCA GTTCCGAGCA AAyGGTACTC AAACTTACAG CGCTACAGTC AATGTCTATG 6900

GTAACAAAGA CGGTAAACCA GACTTGGACA ACATCGTAGC AACTAAAAAA GTCACTATTA 6960

ACATAAACGG TTTAATTTCT aaagaaacag TTCAAAAAGC CGTTGCAGAC AACGTTAAAG 7020

ACAGTATCGA TGTTCCAGCA GCCTACCTAG AAAAAGCCAA GGGTGAAGGT CCATTCACAG 7080

CAGGTGTCAA CCATGTGATT CCATACGAAC TCTTCGCAGG TGATGGTATG TTGACTCGTC 7140

TCTTGCTCAA GGCATCTGAC AAGGCACCAT GGTCAGATAA CGGTGACGCT AAAAACCCAG 7200

CCCTATCTCC ACTAGGTGAA AACGTGAAGA CCAAAGGTCA ATACTTCTAT CAATTAGCCT 7260

TGGACGGAAA TGTAGCTGGC AAAGAAAAAC AAGCGCTCAT TGACCAGTTC CGAGCAAACG 7320

GTACTCAAAC TTACAGCGCT ACAGTCAATG TCTATGGTAA CAAAGACGGT AAACCAGACT 7380

TGGACAACAT CGTAGCAACT AAAAAAGTCA CTATTAACAT AAACGGTTTA ATTTCTAAAG 7440

AAACAGTTCA AAAAGCCGTT GCAGACAACG TTAAGGACAG TATCGATGTT CCAGCAGCCT 7500

ACCTAGAAAA GGCCAAGGGT GAAGGTCCAT TCACAGCAGG TGTCAACCAT GTGATTCCAT 7560

ACGAACTCTT CGCAGGTGAT GGCATGTTGA CTCGTCTCTT GCTCAAGGCA TCTGACAAGG 7620

CACCATGGTC AGATAACGGC GACGCTAAAA ACCCAGCTCT ATCTCCACTA GGTGAAAACG 7680

TGAAGACCAA AGGTCAATAC TTCTATCAAG TAGCCTTGGA CGGAAATGTA GCTGGCAAAG 7740

AAAAACAAGC GCTCATTGAC CAGTTCCGAG CAAACGGTAC TCAAACTTAC AGCGCTACAG 7800

TCAATGTCTA TGGTAACAAA GACGGTAAAC CAGACTTGGA CAACATCGTA GCAACTAAAA 7860

AAGTCACTAT TAAGATAAAT GTTAAAGAAA CATCAGACAC AGCAAATGGT TCATTATCAC 7920

CTTCTAACTC TGGTTCTGGC GTGACTCCGA TGAATCACAA TCATGCTACA GGTACTACAG 7980

ATAGCATGCC TGCTGACACC ATGACAAGTT CTACCAACAC GATGGCAGGT GAAAACATGG 8040

CTGCTTCTGC TAACAAGATG TCTGATACGA TGATGTCAGA GGATAAAGCT ATGCTACCAA 8100

ATACTGGTGA GACTCAAACA TCAATGGCAA GTATTGGTTT CCTTGGGCTT GCGCTTGCAG 8160

GTTTACTCGG TGGTCTAGGT TTGAAAAACA AAAAAGAAGA AAACTAATCA GCTAAGGAAA 8220

TAAATGATGG ATAGTGGGCT GACTAAGATT AGTTTAACAA CTCAATCAGC AATCAGGACT 8280

TTCTTTCAAT AGCAGATTAA AATCATCGTA AAACAATAAA AATAGTGTTA TACTTAAAGC 8340

AGTATAGCAC TGTTTTTATC AAAGGAGAGA CAGATGGGAA AGACAATTTT ACTCGTTGAC 8400

GACGAGGTAG AAATCACAGA TATTCATCAG AGATACTTAA TTCAGGCAGG TTATCAGGTC 8460

TTGGTAGCCC ATGATGGACT GGAAGCGCTA GAGCTGTTCA AGAAAAAACC GATTGATTTG 8520

ATTATCACAG ATGTCATGAT GCCTCGGATG GATGGTTATG ATTTAATCAG TGAGGTTCAA 8580

TACTTATCAC CAGAGCAGCC TTTCCTATTT ATTACTGCTA AGACCAGTGA ACAGGACAAG 8640

ATTTACGGCC TGAGCTTGGG AGCAGATGAT TTTATTGCTA AGCCTTTTAG CCCACGTGAG 8700

CTGGTTTTGC GTGTCCACAA TATTTTGCGC CGCCTTCATC GTGGGGGCGA AACAGAGCTG 8760

ATTTCCCTTG GCAATCTAAA AATGAATCAT AGTAGTCATG AAGTTCAAAT AGGAGAAGAA 8820

ATGCTGGATT TAACTGTTAA ATCATTTGAA TTGCTGTGGA TTTTAGCTAG TAATCCAGAG 8880

CGAGTTTTCT CCAAGACAGA CCTCTATGAA AAGATCTGGA AAGAAGACTA CGTGGATGAC 8940

ACCAATACCT TGAATGTGCA TATCCATGCT CTTCGACAGG AGCTGGCAAA ATATAGTAGT 9000

GACCAAACTC CCACTATTAA GACAGTTTGG GGGTTGGGAT ATAAGATAGA GAAACCGAGA 9060

GGACAAACAT GAAACTAAAA AGTTATATTT TGGTTGGATA TATTATTTCA ACCCTCTTAA 9120

CCATTTTGGT TGTTTTTTGG GCTGTTCAAA AAATGCTGAT TGCGAAAGGC GAGATTTACT 9180

TTTTGCTTGG GATGACCATC GTTGCCAGCC TTGTCGGTGC TGGGATTAGT CTCTTTCTCC 9240

TATTGCCAGT CTTTACGTCG TTGGGCAAAC TCAAGGAGCA TGCCAAGCGG GTAGCGGCCA 9300






AGGATTTTCC TTCAAATTTG GAGGTTCAAG GTCCTGTAGA ATTTCAGCAA TTAGGGCAAA 9360

CTTTTAATGA GATGTCCCAT GATTTGCAGG TAAGCTTTGA TTCCTTGGAA GAAAGCGAAC 9420

GAGAAAAGGG CTTGATGATT GCCCAGTTGT CGCATGATAT TAAGACTCCT ATCACTTCGA 9480

TCCAAGCGAC GGTAGAAGGG ATTTTGGATG GGATTATCAA GGAGTCGGAG CAAGCTCATT 9540

ATCTAGCAAC CATTGGACGC CAGACGGAGA GGCTCAATAA ACTGGTTGAG GAGTTGAATT 9600

TTTTGACCCT AAACACAGCT AGAAATCAGG TGGAAACTAC CAGTAAAGAC AGTATTTTTC 9660

TGGACAAGCT CTTAATTGAG TGCATGAGTG AATTTCAGTT TTTGATTGAG CAGGAGAGAA 9720

GAGATGTCCA CTTGCAGGTA ATCCCAGAGT CTGCCCGGAT TGAGGGAGAT TATGCTAAGC 9780

TTTCTCGTAT CTTGGTGAAT CTGGTCGATA ACGCTTTTAA ATATTCTGCT CCAGGAACCA 9840

AGCTGGAAGT GGTGGCTAAG CTGGAGAAGG ACCAGCTTTC AATCAGTGTG ACCGATGAAG 9900

GGCAGGGTAT TGCCCCAGAG GATTTGGAAA ATATTTTCAA ACGCCTTTAT CGTGTCGAAA 9960

CTTCGCGTAA CATGAAGACA GGTGGTCATG GATTAGGACT TGCGATTGCG CGTGAATTGG 10020

CCCATCAATT GGGTGGGGAA ATCACAGTCA GCAGCCAGTA CGGTCTAGGA AGTACCTTTA 10080

CCCTCGTTCT CAACCTCTCT GGTAGTGAAA ATAAAGCCTA AAACCCCTTT ACAAATCCAG 10140

CTATTCATGG TAGAATAGAT TTTGTGTGAA ATATCAGCAG GAAAGCATGA AGCTCGTCAA 10200

CAGGTGTCTT ATGACAAGTA ACCTTGGCTG TTTAGGCGAA GGGCATCTGC ACGG 10254



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9769 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CCGGCGACTA TCGATAACAC TTGACTTGGT AGCCCCACAT TTTGGACAAC GCATCCTTTC 60 

CCTCCTTATC GTTTTCTTTT CATTATACCA TTTTTTAAGC GATTCCCAAA ACAATTCTTC 120 

TTTTTGCTTG ACAAGTTTTT TGTTTTGTTG TATTATTTAA TTAAGACAAC AAGGTAAAAG 180 

AAAGGAGACT AAGATGTCCT GGACATTTGA CAACAAAAAA CCCATCTATT TACAGATTAT 240 

GGAGAAAATC AAGCTTCAGA TTGTTTCCCA TACACTGGAA CCCAATCAAC AACTTCCAAC 300 

CGTGAGGAGC TAGCTAGCGA GGCTGGTGTC AATCCCAATA CCATCCAAAG AGCCTTATCA 360 

GACCTTGAAC GAGAAGGATT TGTCTACAGC AAGCGAACAA CTGGACGATT TGTGACTAAG 420 

GATAAGGAGC TAATCGCCCA GTCACGCAAA CAATTATCAG AAGAAGAATT GGAACACTTC 480

GTTTCCTCCA TGACCCATTT TGGCTATGAA AAAGAAGAAC TACCAGGCGT AGTCAGTGAT 540

TATATTAAAG GAGTTTAAGC CTATGTCATT ACTAGTATTT GAAAATGTAT CCAAATCATA 600 

TGGAGCAACA CCAGCCCTTG AAAATGTTTC TCTTGACATT CCAGCTGGAA AAATTGTCGG 660 

CCTTCTTGGG CCAAACGGCT CAGGAAAAAC AACCCTGATT AAACTAATTA ATGGCCTCTT 720 

ACAACCAGAT CAAGGACGTG TCCTCATCAA CGACATGGAC CCAAGCCCAG CAACCAAGGC 780 

CGTTGTAGCT TATTTGCCTG ATACGACCTA TCTCAATGAG CAAATGAAGG TCAAAGAAGC 840 

CCTAACCTAC TTCAAGACCT TCTATAAAGA TTGTCAGATC TTGAACGCGC CCATCATCTA 900 

CTTGCAGACC TGGGCATTGA TGAAAATAGT CGTCTCAAGA AACTATCAAA AGGAAACAAA 960 

GAAAAGGTTC AACTGATTTT GGTTATGAGC CGTGATGCTC GTCTCTATGT TTTGGACGAA 1020 

CCCATTGGTG GGGTGGATCC AGCAGCCCGT GCTTATATCC TCAATACCAT TATCAACAAC 1080 

TACTCACCAA CTTCTACCGT TTTGATTTCT ACCCACTTGA TTTCTGATAT CGAGCCAATC 1140 

TTGGATGAAA TTGTCTTCCT AAAAGACGGA AAAGTCGTCC GTCAAGGAAA TGTAGATGAT 1200

ATTCGCTACG AGTCAGGTGA ATCCATTGAC CAACTCTTCC GTCAGaATTT AAGGCCTAAG 1260 

CAAAGGAGAT TATTTATGTT TTGGAATTTA GTTCGCTACG AATTTAAAAA TGTTAACAAG 1320 

TGGTATTTAG CCCTCTACGC AGCCGTGCTA GTCCTTTCTG CCCTCATCGG AATACAGACA 1380 

CAAGGCTTTA AAAATCTACC TTACCAAGAA AGTCAGGCTA CTATGCTACT TTTTCTAGCT 1440 

ACAGTCTTTG GTGGCTTGAT GCTTACACTT GGGATTTCAA CCATTTTCTT GATTATTAAA 1500 

CGCTTCAAAG GTAGTGTCTA CGACCGACAA GGCTATCTGA CTTTGACCTT GCCAGTTTCT 1560 

GAACACCATA TCATCACAGC CAAACTAATC GGTGCCTTTA TCTGGTCATT GATTAGCACC 1620 

GCTGTATTGG CTCTAAGTGC TGTTATTATT CTGGCTTTAA CAGCTCCAGA ATGGATTCCT 1680 

CTTTCTTATG TGATTACATT TGTAGAAACA CATCTCCCTC AGATCTTTCT TACAGGTATA 1740 

TCCTTCCTAC TAAATACTAT TTCAGGAATC CTCTGCATCT ACCTGGCTAT TTCCATTGGA 1800 

CAGCTTTTCA ATGAATACCG TACAGCACTC GCTGTTGCAG TCTACATTGG TATCCAAATC 1860

GTCATTGGAT TTATTGAACT TTTCTTCAAT CTTAGTTCTA ATTTCTATGT CAATTCACTG 1920 

GTAGGACTCA ATGACCATTT CTATATGGGA GCAGGTATAG CCATTGTTGA AGAACTCATA 1980 

TTCATAGCTA TCTTTTATCT CGGAACCTAC TACATCTTGA GAAATAAGGT TAATTTGCTT 2040 

TAAATAATTT TTACCTAGAT ATGTAACATA CTCATAGAAC AAAAGAGACC AGGCAAAAAG 2100 

TCTTTAAAAT TAGAAAACGC ATAGTATCAG GTGTTGAATA TGTACTGCcC CCCAAAAGTT 2160 

AGATTTTTTC TGTCTAACTT TTGGGGGCAG TTCATAAGAA CCTTGGTAAT ATGCGTTTTT 2220

TGTGAGCTGA CTTATTTCCT TTCACTATAT CGCAAAATGA AATAAGAACG GAACGATGGG 2280

ATTTTGGAAT TCAAATCAAT TTATAAGAAT GTTTTAGAAG TAATATTATC CTATTCCAGA 2340 

TTCAGTTCAC TATACAATTG AGTTTTCAAG CAACCTGTTT ACATAATGTG TACATAATTA 2400 

GGTTCGTGAT TCCACCCTTT TCACCTTTAA AAACCTCGCT TTCGCAAGGC TCTTCTATTT 2460 

ATAAGATAAG GCACGTTTAA AGGTTTTCCA AATCCCTAAA TCATCCGTTT GAAGAACGAG 2520 

ACTAGCATAC ATGCGTCCGA TAAATCCTGT TGCTACCACC GCAAAAATCA CTGTAATAGC 2580 

AAGTGAAATC CATGCTTCTG CTCCCCCCGC ATAGTCATTA ATCGTTCGAA ACGGCATAAA 2640 

GAAGGTCGAA ATAAAGGGAA TATAAGAACC AATCTTCAAG AGGAGATTGT CACCAGCTGC 2700 

ACCTAGAGCT GTCACTCCAA AAAAACCACC CATAATCAAA ATCATCAAAG GCGACAAGGC 2760 

TTTCCCTGAG TCCTCAGGAC GAGAAACCAT AGATCCTAGG AAGGCTGCCA AGACTACGTA 2820 

CATGAAAAGA CTGATCAAAA TAAAGAGCAA GGTATTCAGT GAGATAGCAT CTCCCAAGTG 2880 

ATCCAAAATA CCAGACTGAG CCAAGAATGG CAAATCTTTA AAGAGCAAAA CGGCAGCCAG 2940 

ACCACCTACA ACATAGATCC CAATATGCGT TAAAATCACT AGAAACAGAG CCATCATCCG 3000 

CGCATAGAAA TAGTGACTTG CCCTTATGCT AGAAAAAACG ACTTCCATAA TTTTGGTGCC 3060 

TTTTTCACTG GCAACTTCCT GAGCTGTTAC ACCCGCATAG GTAATCAGAA TCATATAAAG 3120 

AAAGAATCCT AAGGCACCTG CTGCAATTGT TTGAATAAAC TTTTTATTTT CCTTGGCTTC 3180 

ATCAATCTTT TCTGTGAATT GAATTGTCTG CGCTAAGCGT TTTTCCTGCT CTTGAGACAA 3240 

GGAAGCAGTT GAACGATTAA GCTGATTTTG CAGTTCATTG AGTGTACCTG TAACCTCAAA 3300 

TTTAATTCCA TTTTCAAGCG ATGTTTCGCC ATGATAAACT GCCTTTAGAA CACTATCTTC 3360 

TTGATCAATG GTCAAATAAC CTTTTAATTT TTCTTCTTTA ATTGCTTCTT TGGCACTTGC 3420 

TTCGTCTTTA TAGTCGAAGT TAACACCATT TACATTCTTC AGTCCTTCTG CTACAGATGG 3480 

CACTGTTGTC ACTACTGCCA CTTTATTATT TTTAGCCATA GAAGAACCTT GGAGATGCCC 3540 

AATTCCTACA GAGATTCCTA AAAAGAGGAA CGGCGAAATC ACCATAAAGA AGAAACTCCA 3600 

TGACTCGACA TGTCGAAGAT AGGTTTCCTT GATTACAACC CACATATTTC TCATACTTCC 3660

ACTCCTGATT CTAGTTTAAA GATTTCATCG ATAGTTGGCG CTTGTTGGTC AAATGTTGCG 3720 

ATATATTGAC CTTGAGTCAA GATTGAGAAG AGTTCCCTTC CAGCGCTCTC ATCCTCCAAA 3780 

ATCAATTTCC AACTGCCTTG TTTGGTCAAG CTCACCTGTT TGACATGAGG AAGATTTTCC 3840 

AATTCTTCCT TGCTTCGTTC ACTTGAAACA AAGAGACGCG TTTTCCCGTA TTGATTGCGG 3900 

ACATCCTGAA CTGGTCCGTG CAAGACCACA CGGCCATCTC GGATCATCAG AATATCGTCA 3960 

CAAAGTTCCT CAACATTGGT CATGACATGG TCAGAAAAGA TAATGGTTGT CCGCGCTCTT 4020

TTTCCTGAAA AATGACTTGT TTGAGCAATT CTGTATTAAC TGGGTCCAAT CCACTAAAAG 4080

GCTCATCCAA GATAATCAGG TCTGGTTCAT GAATCAGAGT AATAATGAGC TGAATCTTCT 4140

GCTGATTTCC TTTTGACAGA CTCTTGATTT TATCTGTCAG CTTTCCTTTC ACTTCCAACC 4200

TCTTCATCCA TTGAGGGAGT TTTTCTTTGA CTTCTTTGGC ATCCATGCCT TTTAGAGTCG 4260

CCAAGTAGCG AACTTGTTCA AGAACTGTCA ATTTAGGCAT GAGATGCGTT CTTCAGGCAG 4320

ATAACCAATC CGAGCATAGG TCTCCTGACG AATATCCTGA CCATCCAGAC CGATTTCTCC 4380

CTGATATTCT AGGAATTTCA AAATACTATG GAAAATCGTT GTTTTTCCAG CACCATTTTT 4440

TCCGACTAGT CCCAAAATAC GACCTGGTCG CGCTTGAAAG TCAATACCAA ACAAAACTTG 4500

CTTGGATCCA AAACTTTTCT CTAGACTTCT TACTTCTAGC ATCTTTCACC TCCGAAATTT 4560

CTTGCACTCA TTATACTCCT TTTTGATAGC CTTTACAATG TTTTTTGTCC ATTTTTAGAA 4620

GACTATTGCT GTGTAAAATA TGGCCTGGAG CACTTTTATA CTCAATGAAA ATCAAAGAGC 4680

AAACTAGGAA GCTAGCCGTA GACTGCTCAA AGTACAGCTT TGAGGTTGCA GATAAAACTG 4740

ACGAAGTCgA CTCAAAACAC TGTTTTGAGG TTGTGGATAG AACTGACGAA kCrTAaCTAT 4800

ATCTACGGCA AGGCGAAcTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT TAGTGATAAA 4860

TCCATTATAC AGCAGCAAAC TTAATTTATA CCTTCCGCTC CTCAACTGTC TATTTTTAAT 4920

CCTGAATTGT TATTTGAGTA ACTCCTTTTT CCTCGTAAAG TTTTCTTCCT CTAAAACTTC 4980

TGGAAAAAGG CTAATAGTTT CAGACAACAT TTTTATAAGA AACAAGTTCA TCTGTCATTT 5040

CAAGAAGGAG TAATCCTTTA TCTACTAATG GACGGAACAG AATTCAACCG CTTGTCCGAT 5100

ATGTTTTCTA AGGATTATAT AGTAAAATGA AATAAGAACA GGACAAATTG ATCAGGACAG 5160

TCAAATTGAT TTCTAACAAT GTTTTAGAAG TAGATGTATA CTATTCTAGT TTCAATCTGC 5220

TATATCTATT ATGCACACCC CTATAGGATC TAATGAAAAT CACAACAGGC TCATTCATAG 5280

ATGGTTACCT AAGCCTAAGG GAACTAAGAA AACGACTACC AAGGAAGTCG CATTCATCGA 5340

AAAGTAGATT AACAACTATC CTAAAAAATG CTTGAACTAC AAGTCCCCCA GAGAAGACTT 5400

CTGGATGACT AACTTGAACT TGAAATTTAG CAATAATTAA TTCACTATCT AACTATATTT 5460

AGTAATTATT TCAGAACTGA TTAATATTAA AATTAACTAA CAATTCAAAG GATTCATACT 5520

AGCCATAAAT TACGTCCATC AGAGAGAGAC TCTTACTACT TTTAGATTTT AGTCTTTCTA 5580

GCTTCAGAAT ACATCTAAAC TTTAGGGAAA ATGACTATTC GAAAGCGCGA ATGCCTCAAA 5640

ATTATCTCAG ATAAGCTATT CGAAACTTAG AATGCTTTTA AATTTATGGA ATTGCGATTA 5700

TTCGAAACCT AGAATGCATA TAACCTTTAG TTGACAGACC TATTCTAAGT CTCGAAGGGC 5760

TATTTACTTT CTATTCCTTA TCAAAAAAGA CTCATTCCCC CTTTCTCCTC CAAAATATGG 5820

TATAGTAGAA ATATACTATC TATGAGGAGT TTACATGTCA CAGGATAAAC AAATGAAAGC 5880

TGTTTCTCCC CTTCTGCAGC GAGTTATCAA TATCTCATCG ATTGTCGGTG GGGTTGGGAG 5940

TTTGATTTTC TGTATTTGGG CTTATCAGGC TGGGATTTTA CAATCCAAGG AAACCCTCTC 6000

TGCCTTTATC CAGCAGGCAG GCATCTGGGG TCCACCTCTC TTTATCTTTT TACAGATTTT 6060

ACAGACTGTC GTCCCTATCA TTCCAGGGGC CTTGACCTCG GTGGCTGGGG TCTTTATCTA 6120

CGGGCACATC ATCGGGACTA TCTACAACTA TATCGGCATC GTGATTGGCT GTGCCATTAT 6180

CTTTTATCTA GTGCGCCTAT ACGGAGCTGC CTTTGTCCAG TCTGTCGTCA GCAAGCGCAC 6240

CTACGACAAG TACATCGACT GGCTAGATAA GGGCAATCGT TTTGACCGCT TCTTTATTTT 6300

TATGATGATT TGGCCCATTA GCCCAGCTGA CTTTCTCTGT ATGCTGGCTG CCCTGACCAA 6360

GATGAGCTTC AAGCGCTACA TGACCATCAT CATTCTGACC AAACCCTTTA CCCTCGTGGT 6420

TTATACCTAC GGTCTGACCT ATATTATTGA CTTTTTCTGG CAAATGCTTT GACACGTAAA 6480

AAATCCGTTT GGTTTCCCAA GTGGATTTTT AAAGCGTAGA TTAACTATAG CTTGATACTA 6540

AATATACTTT GGTATGGAAA TCATGCATAT TTTTCGATAG TGAGGCGAGG ACTTACCTAG 6600

CCTTTCCGCC GTGATAGAAA CACCTGAAAT CTAATGGTTT CAGGTATTCG GAAACTTTGA 6660

GCCTAGTGTC TCAAAGTTTA GGTATGGAAT TTTGAAGAAA GTCGCTACCG TCCGTAATCA 6720

CTTAAGGAAA GGCTCAAAAA TATTGTTTTC AACCACAAAA TCCGTTTGGT TTCCCAAGCG 6780

GATTTTGTGC TTTATTTTGA AACTTCTTTT GCAAGAACAA AGTTCCCAAG TGTGGCAGAA 6840

CCATTTCCTG CGACTGCTGG CGTCACGATA TAGTCACGCA CATCTGGTAC TGGTAGGTAA 6900

CCATTAAGAA GAGATGTAAA TTTCTCACGG ACACGGTCCA GCATATGTTG TTGAGCCATG 6960

ACCCCTCCAC CAAAGACAAT CACGTCTGGG CGGAAAGTCA CTGTCGCATT AACCGCAGCT 7020

TGAGCGATAT AGTAGGCTTG AACATCCCAA ACAGGGTTGT TGAGTTCAAT AGTTTCCCCA 7080

CGTACACCTG TACGAGCTTC CAAACTTGGA CCAGCTGCAT AACCTTCTAG ACATCCCTTA 7140

TGGAAAGGAC AAACACCCTT AAACTCTTTT TCAATATCCA TTGGGTGTCT AGCAACATAA 7200

TAATGACCCA TTTCAGGGTG ACCCACACCA CCGATAAACT CACCACGTTG GATGACGCCT 7260

GCACCGATAC CTGTACCGAT TGTGTAGTAA ACCAAGTTTT CGATACGACC ACCAGCATTG 7320

TTACGGGCAA CCATTTCACC GTAAGCAGAG CTGTTTACGT CTGTTGTGAA GTACATTGGC 7380

ACGTTTAGGG CGCGACGAAG GGCACCAAGC AAGTCTACAT TTGCCCAGTT TGGTTTTGGA 7440

GTCGTCGTGA TAAAGCCATA AGTTTTTGAG TTTTTGTCAA TATCAATCGG CCCAAATGAA 7500

CCAACTGCAA GACCAGCAAG GTTATCGAAT TTTGAGAAGA ACTCAATGGT TTTATCGATT 7560

GTTTCGATTG GAGTTGTTGT TGGAAATTGT GTTTTTTCTA CAACGTTAAA GTTTTCATCA 7620

CCGACAGCAC AGACAAACTT TGTACCGCCC GCTTCCAAGC TTCCATATAA TTTTGTCATG 7680 

ATAAACCTCT TGTTTTTATT TTCTTTATTA TAGCATACTT CGAAAGTCTA AATGTCTCTA 7740 

TTTTTTAGAT TTTCCTCTGT AAATCTTACT ATCTAATAAA AACGAACAAA CATGTCATTT 7800 

GTTCGTTTTC ACATTAGAGA GGATTGATTA GATTTTCACT TCGATCACAG CATCCCCCTT 7860 

AGCAACTGAA CCTGTTGCGA CTGGAGCTAC TGAAGCGTAG TCACCTGTAT TTGTAACGAT 7920 

AACCATTGTT GTATCATCAA GTCCAGCTGC AGCGATTTTG TTTGAGTCAA ATGTTCCAAG 7980 

AACATCGCCA GCTTTCACCT TATTACCTTG AGCAACTTTT GTTTCAAAAC CGTCACCGTT 8040 

CATAGATACA GTATCAATAC CAACATGAAT CAAAACTTCA GCACCATTTC TTGTTTTCAA 8100 

ACCAAAAGCG TGCCCTGTTG GAAAGGCAAT TGAAACTTCA GCATCAGCTG GTGCATAGAC 8160 

CACGCCTTGG CTTGGTTTCA CAACGATACC TTGTCCCATA GCTCCACTTG AGAAGACTGG 8220 

GTCATTGACA TCAGCAAGAG CGACAACATC ACCGACGATA GGAGTTACAA GTGTTTCATT 8280 

TTGAAGAGCT GCTGGCGCAA CTTCTTCTTT TTCTTCAGCC ACTTCAGCTC GTTTTGCAGC 8340 

TGCAGTTGCG TCTACTTCAT CTTCGTAACC AAACATGTAA GTAAGAGCAA AACCAAGGGC 8400 

AAATGATACA GCTACCATAA GAAGGTATTG TGGAAGTTGT CCGTTACCAA CATAAAGCAT 8460 

TGTACCAGGG ATGATGGTGA TACCATTACC AGTACCAGCA AGTCCAAGGA TAGAAGCCAA 8520 

TCCACCACCG ATTGCACCAG CAATCAATGA AAGGAAGAAT GGTTTACGGA AGCGCAAGTT 8580 

CACCCCGAAG ATAGCAGGCT CTGTAATACC TAGGAAGGCA GAAAGAGCAG CCGGGAAAGC 8640 

AAGTGTTTTC AGTTTTGGAT TTTTTGTTTT AACACCAACC GCAACAGTAG CAGCACCTTG 8700 

AGCTGTCATA GCAGCTGTGA TGATAGCGTT GAATGGGTTA GCATGGTCAG CAGCAAGTAA 8760 

TTGCACTTCA AGCAAGTTGA AGATGTGGTG CACACCTGAC ACGACGATCA ATTGGTGAAC 8820 

CCCACCAATC AAGAAACCAC CAAGACCAAA TGGCATGCTA AGAATCGCTT TTGTAGCAAT 8880 

AAGGATGTAG TTTTCAACAA CGTGGAAAAC TGGTCCAATG ACAAAGAGTC CAAGGATAGA 8940 

CATGACCAAA AGTGTCACGA ATGGTGTTAC CAAGAGGTCA ATGACATCTG GAACAACTTG 9000 

CGGACAGCTT TTTCAAATTT AGCTCCGACA ACCCCGATGA TGAAGGCTGG AAGAACGGAA 9060

CCTTGCAAAC CAACAACAGG GATGAAACCA AAGAAGTTCA TCGCTGTTAC TTCACCACCT 9120

TGAGCAACTG CCCAAGCGTT TGGAAGTGAG CCAGAGACAA GCATCATACC AAGAACGATA 9180 

CCAACGGCAG GATTTCCACC AAATACACGG AAGGTTGACC ACACAACCAA ACCTGGCAAG 9240 

ATGATGAAGG CTGTATCTGT CAAGATTTGT GTGTAAGTTG CAAAGTCACC TGGAAGTGGC 9300

ATTTCAAGAG CGTTGAAAAG ACCACGCACA CCCATGAAGA GACCTGTCGC TACGATAACT 9360

GGGATGATTG GAACGAAAAC ATCACCAAAA GTACGGATAG CACGTTGGAA CCAGTTCCCT 9420

TGTTTAGCAA CTTCTGCTTT CATGTCATCC TTAGATGATG TTGGTAATCC AAGTACAACA 9480

ACTTCATCGT ACATTTTGTT AACTGTACCT GTACCAAAGA TAATTTGGTA TTGCCCTGAG 9540

TTAAAGAAAG CACCTTGAAC TTTTTCCAAG TTCTCAATCA CTTCTTTATT GATTTTCTCT 9600

TCATCTTTGA CCATGACACG TAGACGAGTC GCACAGTGGG CAACACTATT GACATTTTCA 9660

CGTCCGCCCA AGGCATCGAT GACTTTTTTT GCAATTTCCT GATTGTTCAT TTGCAAAAAT 9720

CTCCTTATAT AACATTTTGT TCTTGTTTGA AAGCGATTTT ATTCGCCGG 9769



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CGCTTGAGTG CTAATTCATA GTTCTATTGT ATCACTTGGT CAGAAATAAT CAAGAAAAAA 60 

GTCTGACTTT CTCAAGATAA AAAGCCTGAG ACCAACTCAG ACTTTTTAAT TCTTAAAATG 120 

GCAATTCTTC CTCTTCCAAG ACCAAATCTG CCAAATCTTG GCCTGCATTA TTTTCACGCA 180 

TAGCACGTTG GGCACGACTT TCCAAGAGTT GGAATCCTGT GACAAGTACT TCGGTCACGT 240 

AGTTCATTTG GCCATTTTTC TCAAAGCGAC GGGTACGCAA TTCTCCATCA ACGGAAATGA 300 

GACTACCTTT GGTTGCGTAC TTGCCAAAGT TTCTGCTAGT CTGCCCCATA GGACCATATT 360 

GACAAAATCA GCTTCACGTT CACCGTTTTG GTCTTTGTAA CGACGGTTCA CAGCGATAGT 420 

TGCTCGCGCT ACCGACTTGT CATTGTTGGT TTTGTGCAAT TCTGGTGTAG ACGTTAAACG 480 

TCCAATCAAG ATAACTTTAT TATACATATT TTCTTCCTCC TACTTATCTA TTCGTAGGAA 540 

ATCAAAAAAA GTTACAGAAA TTTGTAACTT TTCGAGAAAA TTTTTTATTT TTTATGAACC 600 

ATGAAACCTG TCGCCTGTTG ATTGGCCATA ATGGTCATAT CTGTAATCTG AACACGACGA 660 

GGTTGACTAG TCACATAGAC TACTGTATCT GCAATATCCT GAGCTTGCAA AGCTTCTATT 720 

CCTTGGTAAA CGGACGCAGC TCGTTCTTTA TCACCATGAA AACGCACTGT AGAAAAATCT 780 

GTTTCGACAA TTCCAGGCTG AATGGTCGTC ACCTTGATAT CCGTTGCGAT GGTATCAATT 840 

CGCAGTCCAT CTGAAAAGGT CTTAACTGCC GCCTTGGTGG CTGAGTAAAC AGCTGCACCA 900 

GCATAGGCAT AAATTCCTGC GGTTGACCCC ATATTGATAA TATGACCTTG ATTGGCTTTT 960

ACCATTGCTG GCAAGAAACA GCGAGTGACT GCCATCAAAC CTTTGACATT GGTATCCAAC 1020

ATGGTCAGCA TATCCAACTC TTCATAGTCT TGATAGGGAG CTAAGCCAAG AGCCAGTCCT 1080 

GCGTTATTGA CCAGGATGTC AATCTGACCT ATCGTTTCTA AAATATCAGA GCAGACAGTC 1140 

TTTACCATTG TCATATCCGT GACATCTAGG AGAAAAGTCC AAACTGTTTG ATTTGGAAAA 1200 

GTTTCTGCAA ACTCCGCCTT AAGAGCTTCT AGTCTGTCTA TCCGTCGTCC TGTTAGAACG 1260 

ACATCCTCAC CCTGCTCCAG ATAAGCACGC GCAATCGCTT CACCGATTCC TGATGTCGCT 1320 

CCTGTAATCA CAACATTTTT TGCCATCTTA TTTCCTTCTA GCTGGTCTAT CAGATATTAA 1380 

CAACTTCTTA GGCAGTCCAG TGTTTCGCTG GGTCGAACGG TGTTCCGACA ACTTGGTCTT 1440 

CTGATAATTC AAGCACCCCA CGTTTTTGTG GAGCATTTGG CAGATGCAAT TCACGAGGAC 1500 

TGCACATCAT ACCAAAACTC TTTTCACCAC GAAGTTCACC TGGGAAAATG AGATTCCCTT 1560 

TTGGCATCAT AGCTCCAGGA AGCGCGACAA TGGTTTTCAA CCCCACACGC GCATTGGGAG 1620 

CTCCTGCAAC GATTTGTACA GTCTTATCAC TTGCGACTGC AACTTGGCAG ATGTTGAGGT 1680 

GGTCACTATC TGGATGGGCT ACCATCTCAA CAATTTCACC TACAACAAAC TTAGGTTCCT 1740 

TATCATTAAC AATTTCTTCT GTAAAACCTT CCGCCTGCAA CTCTTGGTTC AAACGAGCGA 1800 

CTTGCTCATC TGTCAAAAAG ACTTGACCGC GCTCTGCAAT TTCAAATAAA CTTGAAACTT 1860 

CGAAAATATT CCAAGCCACT GTTTCCCCAT TATCTTTGAG AAAAACACGG GCTACCTTGC 1920 

CTTTGCGCTC CACATCCAGT TTGGCATCTC CGCTATTTTT CACGATGACC ATAAGGACAT 1980 

CACCGACATG TTCTTTATTA TATGTAAAAA TCATTGTTTC CTTTTTCTCC TATTTCAGTC 2040 

CTGCTAAAAA GTCATTGATT TGTTGCTTGC TTTTACGGTC GCGATTGACA AAACGACCGA 2100 

TTTCCTTGTC CTTTTCTAGA ACAACAAGGC TAGGAATTCC GTAAACATCC CAGAGTTTGG 2160 

CCAAATCCAT ATACTGATCT CGGTCCATTC GAATAAAGGT GAACTCTGGA TTGGTCTCCT 2220 

CAATCTCTGG TAAGGCAGGA TAAATATAAC GACAATCGCT ACACCAGTCT GCCACAAAAA 2280 

TGAAGACCTT CTTGCCCGCT TTTTCCACTA AAGATGCTAA TTCTTCTAAA CTTGCTGGCT 2340 

GTATCATAAG ACTTCCTCCT CATAGACTAG GTCTTCATTT TCATAGACAA AGGTATAATG 2400 

ACGGCCATCC TCAAAAATGA CGCCACCAAC CAAGCTCTCC AGACTGCTTT CGTAAACTTG 2460 

AACATAAAGG GTCGCAATTT CCCCCATGTC GGAAAAATGG TCTCGCACAA TCTCTGTCAA 2520 

CTCTTCCTGA GTCTTCATGA GCTTACGGTC ATCTGCAACT TTTTTOGTAG CAAGAGCAAG 2580 

GCTTCCGATA CCTAGCAGAG CCAAGCCTGC CATCCACATT TTTTTAGCTT TCATACCATT 2640 

CATTTTAACA CAAAAAAGGC TTCAGGACAA ATGAGGAAGC AGCAGAAAAG CAAGTAAAAA 2700

GCCTCTTCCT TTAAGGAAAA GGACTTCTTA TACTCAATGA AAATCAAAGA CCAAACTAGG 2760

AAGCTAGCCG CAGGCTGCTC AAAGCACTGC TTTGAGGTTG TAGATAGAAC TGACGAgTCa 2820 

CTCAAAACAC TGTTTTGAGG TTGTGGATGA AGCTGACGTG GTTTGAAGAG ATTTTCGAAG 2880 

AGTATTATTC TTATTGCCAG GCACCTAAGT TGCCAACGTA GTAACTATCA GGTGTGTAGG 2940 

TATTGCGAGC ATCTTACCTG ATGAAGCCAG ATAATACTAC TTGCCATTGT CTTTGACCCA 3000 

ATCATTCGCA ATCATGGAAC CAGAAGAACT TACATAATAC CATTCTCCCT TGTCATAAAC 3060 

CCAAGTACTG ACTTTCATGG TTCCTGAGCA ATTAAAGGCA AAAAAACTGT CCAATAACAT 3120 

TCGTTTTTTA AAAGCATTTG ACACTACAT 3149 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10240 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:



CCAAAAATTC AACCTTTAAG GGGAGTCCAG AGAGACTCAC AAGGTGTCAG ATAAAAGAAT 60

GGTGCAATTT TCTAGAGGAG ACTTTTTGAG TGTGCTCTCT TGTGTTGTAC GATTTTAACT 120

GAGGCCTTGC ACTAGCAAGG TCTTTTCTTT ATCTGGTCCC CTTAAAATTT AAGGAGGAAA 180

AGTTATGAAT CCCACATGTA AGAAGCGTTT GGGTGTCATT CGGTTGGAAA CCATGAAGGT 240

GGTTGCACAA GAGGAAATCG CGCCACAATC TTTGAATTAG TCCTAGAAGG AGAAATGGTT 300

GAAGCCATGC GAGCAGGCCA ATTTCTTCAT CTGCGTGTAC CGGACGATGC CCATCTCTTA 360

CGTCGTCCTA TTTCAATTTC GTCTATTGAC AAGGCAAACA AGCAGTGTCA CCTCATTTAT 420

CGGATTGACG GAGCTGGGAC TGCAATTTTT TCAACCTTAA GTCAGGGAGA CACTCTTGAT 480

GTGATGGGGC CTCAGGGAAA TGGTTTTGAC TTGTCTGACC TTGATGAGCA GAATCAGGTT 540

CTCCTTGTTG GTGGTGGGAT TGGTGTTCCA CCCTTGCTTG AGGTGGCCAA GGAATTGCAT 600

GAACGTGGAG TGAAAGTAGT GACAGTCCTC GGTTTTGCTA ATAAGGATGC TGTTATTTTG 660

AAAACGGAAT TGGCTCAGTA TGGTCAGGTC TTTGTAACGA CAGATGATGG TTCTTATGGC 720

ATCAAGGGAA ATGTTTCCGT TGTTATCAAT GATTTAGACA GTCAGTTTGA TGCTGTTTAC 780

TCGTGTGGGG CTCCAGGAAT GATGAAGTAT ATCAATCAAA CCTTTGATGA TCACCCAAGA 840

GCCTATTTAT CTCTGGAATC TCGTATGGCT TGTGGGATGG GAGCTTGCTA TGCCTGTGTT 900

CTAAAAGTAC CAGAAAACGA GACGGTCAGC CAACGCGTCT GTGAAGATGG TCCTGTTTTC 960

CGCACAGGAA CAGTTGTATT ATAAGGAGAA AATTATGACT ACAAATCGAT TACAAGTTTC 1020

TCTACCTGGT TTGGATTTGA AAAATCCGAT TATTCCAGCA TCAGGCTGTT TTGGCTTTGG 1080

ACAAGAGTAT GCCAAGTACT ATGATTTAGA CCTTTTAGGT TCTATTATGA TCAAGGCGAC 1140

AACCCTTGAA CCACGTTTTG GGAATCCAAC TCCAAGAGTG GCAGAGACGC CTGCTGGTAT 1200

GCTCAATGCA ATTGGCTTGC AAAATCCTGG TTTAGAGGTT GTTTTGGCTG AAAAGCTACC 1260

TTGGCTGGAA AGAGAATATC CAAATCTTCC TATTATTGCC AATGTAGCTG GTTTTTCAAA 1320

ACAAGAGTAT GCAGCTGTTT CTCATGGGAT TTCCAAGGCA ACTAATGTAA AAGCTATCGA 1380

GCTCAATATT TCTTGTCCCA ATGTTGACCA CTGTAATCAT GGACTTTTGA TTGGTCAAGA 1440

TCCAGATTTG GCTTATGATG TGGTGAAAGC AGCTGTGGAA GCCTCAGAAG TGCCAGTTTA 1500

TGTCAAATTA ACCCCGAGTG TGACCGATAT CGTTACTGTC GCAAAAGCTG CAGAAGATGC 1560

GGGAGCAAGT GGCTTGACCA TGATCAATAC TCTGGTTGGA ATGCGCTTTG ACCTCAAAAC 1620

TAGAAAACCA ATCTTGGCCA ATGGAACAGG TGGAATGTCT GGTCCAGCAG TCTTTCCAGT 1680

AGCCCTCAAA CTCATCCGCC AAGTTGCCCA AACAACAGAC CTGCCTATCA TTGGAATGGG 1740

AGGAGTGGAT TCGGCTGAAG CTGCCCTAGA AATGTATCTG GCTGGGGCAT CTGCTATCGG 1800

AGTTGGAACA GCTAACTTTA CCAATCCTTA TGCCTGCCCT GACATCATCG AAAATTTACC 1860

AAAAGTCATG GATAAATACG GTATTAGCAG TCTGGAAGAA CTCCGTCAGG AAGTAAAAGA 1920

GTCTCTGAGG TAAACTGCAA TCAATCTGTT CTTGATTTTT TATTAGTTTG TAATATGAAT 1980

TTAGGAGAAT TTTGGTACAA TAAAATAAAT AAGAACAGAG GAAGAAGGTT AATGAAGAAA 2040

GTAAGATTTA TTTTTTTAGC TCTGCTATTT TTCTTAGCTA GTCCAGAGGG TGCAATGGCT 2100

AGTGATGGTA CTTGGCAAGG AAAACAGTAT CTGAAAGAAG ATGGCAGTCA AGCAGCAAAT 2160

GAGTGGGTTT TTGATACTCA TTATCAATCT TGGTTCTATA TAAAAGCAGA TGCTAACTAT 2220

GCTGAAAATG AATGGCTAAA GCAAGGTGAC GACTATTTTT ACCTCAAATC TGGTGGCTAT 2280

ATGGCCAAAT CAGAATGGGT AGAAGACAAG GGAGCCTTTT ATTATCTTGA CCAAGATGGA 2340

AAGATGAAAA GAAATGCTTG GGTAGGAACT TCCTATGTTG GTGCAACAGG TGCCAAAGTA 2400

ATAGAAGACT GGGTCTATGA TTCTCAATAC GATGCTTGGT TTTATATCAA AGCAGATGGA 2460

CAGCACGCAG AGAAAGAATG GCTCCAAATT AAAGGGAAGG ACTATTATTT CAAATCCGGT 2520

GGTTATCTAC TGACAAGTCA GTGGATTAAT CAAGCTTATG TGAATGCTAG TGGTGCCAAA 2580

GTACAGCAAG GTTGGCTTTT TGACAAACAA TACCAATCTT GGTTTTACAT CAAAGAAAAT 2640

GGAAACTATG CTGATAAAGA ATGGATTTTC GAGAATGGTC ACTATTATTA TCTAAAATCC 2700

GGTGGyTACA TGGCAGCCAA TGAATGGATT TGGGATAAGG AATCTTGGTT TTATCTCAAA 2760

TyTGATGGGA AAATrGCTGA AAAAGAATGG GTCTACGATT CTCATAGTCA AGCTTGGTAC 2820 

TACTTCAAAT CCGGTGGTTA CATGACAGCC AATGAATGGA TTTGGGATAA GGAATCTTGG 2880 

TTTTACCTCA AATCTGATGG GAAAATAGCT GAAAAAGAAT GGGTCTACGA TTCTCATAGT 2940 

CAAGCTTGGT ACTACTTCAA ATCTGGTGGC TACATGGCGA AAAATGAGAC AGTAGATGGT 3000 

TATCAGCTTG GAAGCGATGG TAAATGGCTT GGAGGAAAAA CTACAAATGA AAATGCTGCT 3060 

TACTATCAAG TAGTGCCTGT TACAGCCAAT GTTTATGATT CAGATGGTGA AAAGCTTTCC 3120 

TATATATCGC AAGGTAGTGT CGTATGGCTA GATAAGGATA GAAAAAGTGA TGACAAGCGC 3180 

TTGGCTATTA CTATTTCTGG TTTGTCAGGC TATATGAAAA CAGAAGATTT ACAAGCGCTA 3240 

GATGCTAGTA AGGACTTTAT CCCTTATTAT GAGAGTGATG GCCACCGTTT TTATCACTAT 3300

GTGGCTCAGA ATGCTAGTAT CCCAGTAGCT TCTCATCTTT CTGATATGGA AGTAGGCAAG 3360 

AAATATTATT CGGCAGATGG CCTGCATTTT GATGGTTTTA AGCTTGAGAA TCCCTTCCTT 3420 

TTCAAAGATT TAACAGAGGC TACAAACTAC AGTGCTGAAG AATTGGATAA GGTATTTAGT 3480 

TTGCTAAACA TTAACAATAG CCTTTTGGAG AACAAGGGCG CTACTTTTAA GGAAGCCGAA 3540 

GAACATTACC ATATCAATGC TCTTTATCTC CTTGCCCATA GTGCCCTAGA AAGTAACTGG 3600 

GGAAGAAGTA AAATTGCCAA AGATAAGAAT AATTTCTTTG GCATTACAGC CTATGATACG 3660 

ACCCCTTACC TTTCTGCTAA GACATTTGAT GATGTGGATA AGGGAATTTT AGGTGCAACC 3720 

AAGTGGATTA AGGAAAATTA TATCGATAGG GGAAGAACTT TCCTTGGAAA CAAGGCTTCT 3780 

GGTATGAATG TGGAATATGC TTCAGACCCT TATTGGGGCG AAAAAATTGC TAGTGTGATG 3840 

ATGAAAATCA ATGAGAAGCT AGGTGGCAAA GATTAGTACT ATAAGTGAAT ATGATTTGAG 3900 

TGAATAGTAA GTTAAAAATC CTGATTTCAA GTAAAATCAG GATTTTTTCA TGGATGCAAT 3960 

TTTTTTGGAG TCTGGTGTGA CGCGGAGGGT CTTTTGTCCT GTGTAAGTGA CAAAGCCGGG 4020 

TTTTCCACCA GTTGGTTTAT TGAGTTTTTT GACTTCAATC ATATCTACCT GCACCAGA1T 4080 

CGACAGGCGC CCTTGAGAGA AGTAGGCAGC TAACTCTGCT GCGTCTGTCT TGACTGCATC 4140 

AGATGGGTCA AGATTTCCTG AGATGACAAC ATGGCTTCCA GGAATGTCCT TAGCATGGAA 4200 

CCAAAGTTCC TCCTTGCGGG CCATTTTAAA GGTCAATTCC TCATTTTGAA GATTGTTTCG 4260 

TCCGACATAG ATGATGGTTT TGCCATCGCT TGCTAGATAT TGTTCTAGTT TTTTGCGTTT 4320 

CTGGATTTTC TCCCGTTGTC TTCTGCGGAT AAAACCTGTT TGAATCAATT CTTCACGGAT 4380 

TTCAGCGATT TCTTCCAGTC CAGCTTGGTT GAGGACGGTT TCTACACTTT CCAGATAGAG 4440 

AATAGTGGCT TTGGTTTCTT CAATCAAATC AGTCAAGTAT TTGACAGCTT CTTTGAGTTT 4500 
CTGATACCGT TTAAAATAGC GTTGGGCATT CTGGTTGGGA GTCAGAGCCT TATCAAGCGC 4560

AATCATGATA GGTTGGTTGG TATAGTAGTT GTCTAGGATA ACCTGGTCTT GGTCGTTAGG 4620 

CACTTGGTGG AGGAAGGTTG TCAGCAATTC TCCTTTTTGA CGAAATTCTT CAGCGTTGTC 4680 

TGTCGCCAGT AACTCTTTTT CCTGTTTTTT GAGTTTGTGT CGGTTTTTCT GAAGTTCATT 4740 

TTCAACACGA CGAATCAGTT CACTGGCCTG CTGTTTGACG CGGTCGCGCT CAGCCTTATC 4800 

CTTATAGTAG GTGTCCAACA AATCAGAAAG ATTTGCAAAA GGCTCTCCCA CCTGATTTGC 4860 

AAAAGGAACT GGACTGAAGG AAGTCTCAGT CAAGCATGGC TTGGTTTCTT GATTGAAAAA 4920 

ATTTCGGAAA GCGGAAAGTT TTTCACTAAC CAGTATCCTT TCCAATTCAT TTGCCGTATC 4980 

GCGTCCCAGA CCTTGAAAGA GGCTTTGAAG ATTTTTTGCT GTTAGTTCTT GGGTTTGCAG 5040 

GATTTCAAAG AGCTTTTCAT CCTTGATAGT AAAAGGATTG AGAGATTTTG TACTTGGCGG 5100 

AGCGATATAG GTCGATCCTG GAAGTAAGGT GCGGTAGCTA TTTTGTGAAA AGCCGACGTG 5160 

TTTGATAACT TCGAGGATTT TATGACTGCT TTTATCGACC AGTAGAATAT TACTGTGTTT 5220 

CCCCATAATT TCGATAATCA AGGTAGCCTG GATATGGTCT CCAATCTCGT TTTTATTGGA 5280 

AACTGTAATT TCCACAATAC GGTCATTTTC CACTTGCTCA ATCGACTCAA TCAGGGCCCC 5340 

CTGCAAATAC TTTCTCAAAA CCATGATAAA GGTAGAAGGT TGAGCTGGAT TTTCAAAAGT 5400 

CGTTTGGGTC AGCTGAATGC GTCCAAAAAC TGGATGGGCA GAAAGGAGCA GGCGATGGCT 5460 

TTGGCGATTG CTGCGGATTT GCAAGACCAA CTCTTGTTCA AAAGGCTGAT TGATTTTCTG 5520 

GATGCGACCA TTCACTAATT CGCTTCGCAA TTCCTCAACT ATGTGGTGTA AAAAAAATCC 5580 

GTCAAATGAC ATCGTTCTCT CCTTGTGATT GTATTCCATA GTATTATATC AAAAAGGTAG 5640 

AATAAAATCA TGGAAATGTG GTATAATAAA GCCAAGTAAA GAGAAACGAG AAGCACATGT 5700 

ATATTGAAAT GGTAGATGAA ACTGGTCAAG TTTCAAAAGA AATGTTGCAA CAAACCCAAG 5760 

AAATTTTGGA ATTTGCAGCC CAAAAATTAG GAAAAGAAGA CAAGGAGATG GCAGTCACTT 5820 

TTGTGACCAA TGAGCGTAGT CATGAACTTA ATCTGGAGTA CCGTAACACC GACCGTCCGA 5880 

CAGATGTCAT CAGCCTTGAG TATAAACCAG AATTGGAAAT TGCCTTTGAC GAAGAGGATT 5940 

TGCTTGAAAA TTCAGAATTG GCAGAGATGA TGTCTGAGTT TGATGCCTAT ATTGGGGAAT 6000 

TGTTCATCTC TATCGATAAG GCTCATGAGC AGGCCGAAGA ATATGGTCAC AGCTTTGAGC 6060

GTGAGATGGG CTTCTTGGCA GTACACGGCT TTTTACATAT TAACGGCTAT GATCACTACA 6120 

CTCCGGAAGA AGAAGCGGAG ATGTTCGGTT TACAAGAAGA AATTTTGACA GCCTATGGAC 6180 

TCACAAGACA ATAAACGAAA ATGGAAAAAT CGTGACTTGA TATCCAGTTT AGAATTTGCT 6240 
TTGACAGGTA TTTTTACTGC TATCAAGGAA GAACGCAATA TGCGAAAACA CGCAGTGACG 6300

GCTCTAGTGG TCATCCTTGC AGGTTTTGTT TTTCAGGTGT CACGAATCGA ATGGCTCTTT 6360 

CTCCTATTGA GTATTTTCTT GGTAGTAGCC TTTGAGATTA TCAACTCTGC TATTGAAAAT 6420 

GTGGTGGATT TGGCCAGTCA CTATCACTTT TCCATGCTGG CTAAAAATGC CAAGGATATG 6480 

GCGGCCGGCG CGGTATTAGT GGTTTCTCTT TTCGCAGCCT TAACAGGCGC ATTGATTTTT 6540 

CTCCCACGAA TCTGGGATTT ATTATTTTAA ACAGTAAGAG GAAATTATGA CTTTTAAATC 6600 

AGGCTTTGTA GCCATTTTAG GACGTCCCAA TGTTGGGAAG TCAACCTTTT TAAATCACGT 6660 

TATGGGGCAA AAGATTGCCA TCATGAGTGA CAAGGCGCAG ACAACGCGCA ATAAAATCAT 6720 

GGGAATTTAC ACGACTGATA AGGAGCAAAT TGTCTTTATC GACACACCAG GGATTCACAA 6780 

GCCTAAAACA GCTCTCGGAG ATTTCATGGT TGAGTCTGCC TACAGTACCC TTCGCGAAGT 6840 

GGACACTGTT CTTTTCATGG TGCCTGCTGA TGAAGCGCGT GGTAAGGGGG ACGATATGAT 6900 

TATCGAGCGT CTCAAGGCTG CCAAGGTTCC TGTGATTTTG GTGGTGAATA AAATCGATAA 6960 

GGTCCATCCA GACCAGCTCT TGTCTCAGAT TGATGACTTC CGTAATCAAA TGGACTTTAA 7020 

GGAAATTGTT CCAATCTCAG CCCTTCAGGG AAATAACGTG TCTCGTCTAG TGGATATTTT 7080 

GAGTGAAAAT CTGGATGAAG GTTTCCAATA TTTCCCGTCT GATCAAATCA CAGACCATCC 7140 

AGAACGTTTC TTGGTTTCAG AAATGGTTCG CGAGAAAGTC TTGCACCTAA CTCGTGAAGA 7200 

GATTCCGCAT TCTGTAGCAG TAGTTGTTGA CTCTATGAAA CGAGACGAAG AGACAGACAA 7260

GGTTCACATC CGTGCAACCA TCATGGTCGA GCGCGATAGC CAAAAAGGGA TTATCATCGG 7320 

TAAAGGTGGC GCTATGCTTA AGAAAATCGG TAGCATGGCC CGTCGTGATA TCGAACTCAT 7380 

GCTAGGAGAC AAGGTCTTCC TAGAAACCTG GGTCAAGGTC AAGAAAAACT GGCGCGATAA 7440 

AAAGCTAGAT TTGGCTGACT TTGGCTATAA TGAAAGAGAA TACTAAGTAG AGGTAGGCTC 7500 

ATGCCTGCTT CTTGTTTTTA CAGAAGGAGG ACTTATGCCT GAATTACCTG AGGTTGAAAC 7560

CGTTTGTCGT GGCTTAGAAA AATTGATTAT AGGAAAGAAG ATTTCGAGTA TAGAAATTCG 7620 

CTACCCCAAG ATGATTAAGA CGGATTTGGA AGAGTTTCAA AGGGAATTGC CTAGTCAGAT 7680 

TATCGAGTCA ATGGGACGTC GTGGAAAATA TTTGCTTTTT TATCTGACAG ACAAGGTCTT 7740 

GATTTCCCAT TTGCGGATGG AGGGCAAGTA TTTTTACTAT CCAGACCAAG GACCTGAACG 7800 

CAAGCATGCC CATGTTTTCT TTCATTTTGA AGATGGTGGC ACGCTTGTTT ATGAGGATGT 7860 

TCGCAAGTTT GGAACCATGG AACTCTTGGT GCCTGACCTT TTAGACGTCT ACTTTATTTC 7920 

TAAAAAATTA GGTCCTGAAC CAAGCGAACA AGACTTTGAT TTACAGGTCT TTCAATCTGC 7980

CCTTGCCAAG TCCAAAAAGC CTATCAAATC CCATCTCCTA GACCAGACCT TGGTAGCTGG 8040 
ACTTGGCAAT ATCTATGTGG ATGAGGTTCT CTGGCGAGCT CAGGTTCATC CAGCTAGACC 8100

TTCCCAGACT TTGACAGCAG AAGAAGCGAC TGCCATTCAT GACCAGACCA TTGCTGTTTT 8160

GGGCCAGGCT GTTGAAAAAG GTGGCTCCAC CATTCGGACT TATACCAATG CCTTTGGGGA 8220

AGATGGAAGC ATGCAGGACT TTCATCAGGT CTATGATAAG ACTGGTCAAG AATGTGTACG 8280

CTGTGGTACC ATCATTGAGA AAATTCAACT AGGCGGACGT GGAACCCACT TTTGTCCAAA 8340

CTGTCAAAGG AGGGACTGAT GGGAAAAATC ATCGGAATCA CTGGGGGAAT TGCCTCTGGT 8400

AAGTCAACTG TGACAAATTT TCTAAGACAG CAAGGCTTTC AAGTAGTGGA TGCCGACGCA 8460

GTCGTCCACC AACTACAGAA ACCTGGTGGT CGTCTGTTTG AGGCTCTAGT ACAGCACTTT 8520

GGGCAAGAAA TCATTCTTGA AAACGGAGAA CTCAATCGCC CTCTCCTAGC TAGTCTCATC 8580

TTTTCAAATC CTGATGAACG AGAATGGTCT AAGCAAATTC AAGGGGAGAT TATCCGTGAG 8640

GAACTGGCTA CTTTGAGAGA ACAGTTGGCT CAGACAGAAG AGATTTTCTT CATGGATATT 8700

CCCCTACTTT TTGAGCAGGA CTACAGCGAT TGGTTTGCTG AGACTTGGTT GGTCTATGTG 8760

GACCGAGATG CCCAAGTGGA ACGCTTAATG AAAAGGGACC AGTTGTCCAA AGATGAAGCT 8820

GAGTCTCGTC TGGCAGCCCA GTGGCCTTTA GAAAAAAAGA AAGATTTGGC CAGCCAGGTT 8880

CTTGATAATA ATGGCAATCA GAACCAGCTT CTTAATCAAG TGCATATCCT TCTTGAGGGA 8940

GGTAGGCAAG ATGACAGAGA TTAACTGGAA GGATAATCTG CGCATTGCCT GGTTTGGTAA 9000

TTTTCTGACA GGAGCCAGTA TTTCTTTGGT TGTACCTTTT ATGCCCATCT TCGTGGAAAA 9060

TCTAGGTGTA GGGAGTCAGC AAGTCGCTTT TTATGCAGGC TTAGCAATTT CTGTCTCTGC 9120

TATTTCCGCG GCGCTCTTTT CTCCTATTTG GGGTATTCTT GCTGACAAAT ACGGCCGAAA 9180

ACCCATGATG ATTCGGGCAG GTCTTGCTAT GACTATCACT ATGGGAGGCT TGGCCTTTGT 9240

CCCAAATATC TATTGGTTAA TCTTTCTTCG TTTACTAAAC GGTGTATTTG CAGGTTTTGT 9300

TCCTAATGCA ACGGCACTGA TAGCCAGTCA GGTTCCAAAG GAGAAATCAG GCTCTGCCTT 9360

AGGTACTTTG TCTACAGGCG TAGTTGCAGG TACTCTAACT GGTCCCTTTA TTGGTGGCTT 9420

TATCGCAGAA TTATTTGGCA TTCGTACAGT TTTCTTACTG GTTGGTAGTT TTCTATTTTT 9480

AGCTGCTATT TTGACTATTT GCTTTATCAA GGAAGATTTT CAACCAGTAG CCAAGGAAAA 9540

GGCTATTCCA ACAAAGGAAT TATTTACCTC GGTTAAATAT CCCTATCTTT TGCTCAATCT 9600

CTTTTTAACC AGTTTTGTCA TCCAATTTTC AGCTCAATCG ATTGGCCCTA TTTTGGCTCT 9660

TTATGTACGC GACTTAGGGC AGACAGAGAA TCTTCTTTTT GTCTCTGGTT TGATTGTGTC 9720

CAGTATGGGC TTTTCCAGCA TGATGAGTGC AGGAGTCATG GGCAAGCTAG GTGACAAGGT 9780

GGGCAATCAT CGTCTCTTGG TTGTCGCCCA GTTTTATTCA GTCATCATCT ATCTCCTCTG 9840

TGCCAATGCC TCTAGCCCCC TTCAACTAGG ACTCTATCGT TTCCTCTTTG GATTGGGAAC 9900 

CGGTGCCTTG ATTCCCGGGG TTAATGCCCT ACTCAGCAAA ATGACTCCCA AAGCCGGCAT 9960 

TTCGAGGGTC TTTGCCTTCA ATCAGGTATT CTTTTATCTG GGAGGTGTTG TTGGTCCCAT 10020 

GGCAGGTTCT GCAGTAGCAG GTCAATTTGG CTACCATGCT GTCTTTTATG CGACAAGCCT 10080 

TTGTGTTGCC TTTAGTTGTC TCTTTAACCT GATTCAATTT CGAACATTAT TAAAAGTAAA 10140 

GGAAATCTAG TGCGAGTAAA AATCAATCTC AAATGCTCCT CTTGTGGCAG TATCAATTAC 10200 

CTAACCAGTA AAAATTCAAA AACCCATCCA GACAgATTGA 10240 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13206 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



CGCTTTATCG TGGACGTGGT CAAGCCGAGA ATTTCATCAA GGAGATGAAG GAGGGATTTT 60

TTGGCGATAA AACGGATAGT TCAACCTTAA TCAAAAACGA AGTTCGTATG ATGATGAGCT 120

GTATCGCCTA CAATCTCTAT CTTTTTCTCA AACATCTAGC TGGAGGTGAC TTCCAAACTT 180

TAACAATCAA ACGCTTCCGC CATCTTTTTC TTCACGTGGT GGGAAAATGT GTTCGAACAG 240

GACGCAAGCA GCTCCTCAAA TTGTCTAGTC TCTATGCCTA TTCCGAATTG TTTTCAGCAC 300

TTTATTCTAG GATTAGAAAA GTCAACCTGA ATCTTCCTGT TCCTTATGAA CCACCTAGAA 360

GAAAAGCGTC GTTAATGATG CATTAAAGAA CAGTCGAGAT GAAAAAATCG TGTGACGCAC 420

CAAGGGAGGA GTCTGCCCTT TTGAGGAAAT CTAGCGAGGA AAAACGATAC TGGAACAGCA 480

GAAAGTAAAA CTGACCTCAT GAGGAGGAAG AAAGTGGCTC ATGAGGTCAG GGGTTTTG'i\A 540

AGTTACATCT AGTTGAGAGA GGTATGAATG ATTTGGGATT AATCATTTCT TGTTTTAAAT 600

CAGGAGAATA GTAACGATTT TTTCCTTTTT TGACGAACTC TATTCCGTAA CGATCAATCA 660

ATTTAATCAT GTACCTAATA TTAGAATTGT TTATCCCAAA TTTATTTGAA AGCTTCTCTA 720

AGCTATATCC TTGTTTTCTA AGTTCATAGA TCTGAACTTT ATCATCATAA GTTAGTTTCA 780

TAATAAAAAC ACCCCAAAAG TTAGATTTTT TCTGTCTAAC TTTTGGGGGG CAGTTCATTC 840

AACACCTGAT ACTATGCGTT TTTCTTATTT GAAATACTTT TTACTCAACC TCTTTATACT 900

CAATGAAAAT CAAAGTGCAA ACTAGAAAGC TAGCCTCAGG CTGCTCAAAA CAGTGTTTTG 960

AGGTTGCAGA TGGAAGCTGA CGTGGTTTGA AGAGATTTTC GAAGAGTATT ACTTAATCTT 1020

CTTGATACTT TGACTAAGAA TAAATCCTAC AATCATCCCT ACCATATTTT GCATAAAATT 1080 

CGGTAGAATT TCTGGGAGGG CTGCTGCCCA GCCATTCATC AAAGCAGAAC CCAAGGCGTA 1140 

GCCTCCTACC ATGGCAATAG TTGCTAAAAT AAGGCCTAAC CACTGACTTT TTCCTTTAAA 1200 

TCCTGCGAAA AATCCCTGCA AGCCATGGTT GACCAAGCTA AAGAACATCC ACTGAGGGTA 1260 

GCCTGATAAG AGGTCAATCA AGAAACTTGC TAGTCCTCCG ACTACCGCTC CTTCACGACT 1320 

ACCAAAGTAA AAGGCCGCAA AGAAGACACC AGCATCTAAA AGAGTTAGAA TTCCTGTAGG 1380 

TGTTGGGATT TTTAAGAAAT AACCTAGAAC CACAGAAAGG GCGGTTAATA GGGATACAAG 1440 

GGCGATTTTA GTTGTTTTTG TTTGCTTCAT ATTGTCTTAC TCCATACTGA TCTGCTTGTG 1500 

CAATAGCACG ATAAACGAAA GCCTTAGAGC TTTCTACTGC TGGCAAAAGT TTATCACCTT 1560 

TAACCAGGTG ACTGGCAATC5 CTAGAGsCAA AGGTACAACs TGCACCAGCA TTTTGGCCTT 1620 

GGATAACTGG ATTTTCTAGG ATAGTAAAGG TCTGTCCATC ATAAAAGACA TCCACAGCCT 1680 

TGTCCTGACT AAGACGATTG CCTCCCTTGA TAATGACTGt GGCGCTCCTA AATCATGCAA 1740 

TTTCTGCGCT GCAGTTTTCA TGTCTTCCAA GGTTTTAATT TCCTGACCGG ATAATAATTC 1800 

TGCTTCTGGG AGATTAGGCG TAATCACACT GACATAAGGG AAAAAGCGAA TCAACTCTTG 1860

GCAGAGCTCA CTGACAGCTA CATCATGCGT TTCCTTGCAG ACCAAGACAG GATCCAACAC 1920 

CACAGGTACT CCTGGGCGTT GTTTGATAAA GTCCAAGGCC TTCTCAGCCA CGCTGACAGT 1980 

AGGGAGAAGA CCAATCTTAA TTCCCCCAAA TTCCACATCA CGCAAGCTAT CTAATTCATG 2040 

TTGAAAAATG GTATCATCAG TTGGAAAGAC TTCAAATCCT TTTTCTGTCA AGGCTGTCAA 2100 

ACAAGTCACT GCTACAAACC CATGCAAGCC GTTCAAGGTA TAGGTAGCCA AATCAGCTGA 2160 

CAGTCCACCA CCACTAAAAA TATCATTTCC AGAAAGTGCT AAAATACGAT TATTCTTCAT 2220 

AACGAATCTC CTTTAAATAC AAACCATTTG GTGCTGCAGT GGGACCTGCA AGTTGCCTGT 2280 

CCTTCTTCTC CAAGATGAGA TCAATCTGCT CTACTGGCAT GCGGTTGTTA CCGATTTTGA 2340 

GAAGAGTCCC CACCATATTG CGAATCTGTT TATACAAGAA ACCATTTCCT GAAAAGGTAA 2400 

AGGTCAAAAA TTGTCCTGTC TCATCGACTA TTAAACTAGC TTCTGTGATG GTGCGAACCT 2460 

TATCCTCTAC ACTAGTCCCA GAGGCTGTAA AACCGGTAAA ATCATGGGTT CCCTCTAGCT 2520

TTTTGATTGC AATCTGCATT CGTTCCACAT CGAGTGGGTA GGGAAAGTGG GTGGCATAGT 2580 

GACGGCGCAT CGGATTTTTG GGACGTCCTC TATCCACAGT AAACTCATAG GTCTTGCTAT 2640 

GCTTGGCATA ACGGCAATGA AAATCATCTG CCACAAGCTC AATCGAAATC ACATCAATAT 2700 
CTTCAGGAGA CTGGGTATCC AAGGCAAAAC GGAGTTTCTC CTCATCCATC TGATAAGGCA 2760

GGTCAAAATG AATCACCTGT CCCAGGGCAT GAACCCCACT ATCTGTCCTA CCAGCACCGT 2820 

GAACAGTAAT GGCTTGCCCT TTATTTAATC TGGTCAAGGT TTTTTCAATT TCTTCCTGAA 2880 

CGCTACGCGC ATGAGGCTGG CGCTGAAAGC CAGCAAAGGC ATAACCATCA TAGGAAATAG 2940 

TTGCTTTATA TCTCGTCATA GCCTCTATTT TATCAAGAAA TTAGTCTGTA AACAAGGACC 3000 

TAAAACAAAT ATTGTATGGG TATAAAAATC TCATACTCTT CGAAAATCTC TTCAAACCAC 3060 

GTCAGTTTCC ATCTGCAACC TCAACACACT ATTTTGAGCA ACCTGCGGCT AGCTTTCTAT 3120 

AGTAGATTGA AATAAGATAT GAACAACTCT ATTAGGAAAG TCAAATTAAT TTCTAGAAAT 3180 

ATTTTAGCAG CTACAGCGTA CTATTCCAAA CTCAATCAAC TATAGTTTGC TCTTTGATTT 3240 

TCATTGAGTA TCAAAAGAAA AACTTAGGAA TCAATCCTAA GCTCTCTTCT GAAGTAGGTA 3300

CATGACAAAG ATAGAGATTA CAATCAACCA ACCTCCTAAG ATACTAAAGA CCAACATCCC 3360 

ATTGTGAGTT AGTAAGCCAA TTGCACCTAG AACGAATGGG GTCGTAAAGG CTCCGAAACT 3420 

ACAGCCTAAT ACAGCAAATG AAGTTGCTTG ATTGAGGAGT TTAGCTGGAA TTCGTTCAGA 3480 

GACAAGTTGA AAGACCGTCG TCAAGACTAC ACTATAGGCA AATCCAGCCA GAACACTTCC 3540 

TGCTACTACC ACCCACAAGG ATGAAGACAA GGCAATCACG ATTTGCCCCA AGCCAAAGGT 3600 

AATACCAGAC CAGAGGAGCA GTTTCTCTTT AAAGATAGAA ATCAAGAAAG AAAAACTCAC 3660 

CCCAGCCACA ATCCCGATCA ACTGCATGAT ACTAAGAACA AAACTAGATA ACTGGGCATC 3720 

CCCCAATCCT CTTTCCACCA TCAAACTTGG AATACGGATG GTAATAGCTG TATTGGTACA 3780 

AACTACAACT GCCGCTTCGA TAGCTAAGGT AAAAATCAAG CCTTTCATTT CTCGAGTTAA 3840 

ACGACTTGCT TCCTTCGCTC TTTTCTTGAC TTCTTTCTTT GATTTTCCAT AAGGGACAAA 3900 

GAGCAGATAA AGGGGCAGCA CCAAAAATCC AGCACTATAG GCTAGAAAGA TAGCTGTCCA 3960 

ACCAAAGGCC AACAACTGAC CGACGGCCAA GGTAATGAGA GAAGCTCCAA CGACCTCTGC 4020 

AGAAGCGCGT AGCCCTAACA TCTGAATTCG CCTTTTTCCT TGGTAGCGTT CACTGATAAT 4080 

AGAAATGGCC TTGGCATTGA TCATCCCAAG ACCCAAACCA AAGAGAAGCC GTGTTCCAAA 4140 

GACAAAGGGA TAGGCTTGGT ACCAGAAGGG AGCTGTACCG CTCAATGATA AAATCAGCAA 4200 

GCCCAAACTA ATCTGTAAGC GCTCAGGAAA TATTTTTTCT AAGAAACCAT TTAGCAGTAA 4260 

CATCATCATG ATTCCAAAGG AAGGCAAGCT CACCAAGAGC TCAATTTGTT CCTTAGAATA 4320 

ACCCTGATAA TAGTCAAACA TGGCTGGTAG GGCACTCGAA ATGGAAAAGG AGGTAATCAA 4380 

AACGAGGGAG AGAGCCAAAA TGCTGGCCCG TTCTAAAAAT TGTTTCATGA AATCTCTTTC 4440 

TATATTTCTC TTAATCTTCT ACTTTTTTGA TAGTTATCAA ATAAGCAAGA AAAGAAGAAG 4500 
CCTCATTGGT TTGTAGACTC CTTCTTAAAT TCGAAAATGA ATCCCTTGTA TCTTATACTC 4560

AATGAAAATC AAAGAGCAAA CTAGGAAGCT AGCCGCAGGT TGTTCAAAAC AGTGTTTTGA 4620

GGTTGCAGAT GGAAACTGAC GTGGTTTGAA GAGATTTTCG AAGAGTATTA GGATGACTTT 4680

CTCTTGATTT GCTTGATAAA GTAGAAAATA AATCCTGCTA CCATATAGGC AACAAAGATA 4740

ATCAGACACC ACTTAAACAC AACATTCCAA CCCTTGTTCA CATTCAAAAA GAAGTAAGGG 4800

AAAGGATTAT CCTTGGCATT TGGAATATTG AGTTTTAGAA CCAAGCCATT AAAAAGAGCA 4860

AACATCATAT ACAGAAAGGG TAAAATGGTC CACACTGCTG GATCCCAAAT CTTGTATTGA 4920

CCCTGTTTGT CAAAAAAGAG GGTATCCGCT AAAAACCAGA TGGGAACGAT ATAGTGGCAA 4980

AGGAAATTTT CTAGGGTATA GAAATTAGTC GCAATGGGCG CCAAGAGGAA ATGGTAAATC 5040

ACACAGGTAA TCATGATACT CATGGTGACC CCACCTTTTA AGCGCAAGAG ACTTGGCCTT 5100

TGCCAATTTT CACCTACACG GCTCATAACC TTTAGAAGAT AAAGGGTAAA AATAGTTACC 5160

AAGAGGTTGG ACAGAACCGT GTAATAGAGA AGCATCCCAA AACCACCATG CTTAGTAATT 5220

TCAAGATAAA CTCCCGTAAA AGCCGCTAGA AACAAGAAGA TACGGCTATA AAATACAAGT 5280

TTATAGTGTT TTGACATGCT TAAATCTTCC TCACAAACTC TGATTTAAGT TTCATGGCAC 5340

CAAAACCATC AATCTTACAG TCGATATTGT GGTCGCCTTC TACGATGCGG ATATTTTTCA 5400

CGCGCGTCCC TTGTTTCAAA TCTTTTGGCG CACCTTTTAC TTTCAAGTCC TTGATGAGAG 5460

TTACTGTATC ACCATCAGCC AATTTATTTC CGTTGGCATC GATAGCGACA AGACCTTCTT 5520

CTACTTCTGC AACTTCAGCA GGATTCCACT CATGAGCACA CTCTGGGCAA ACCAGTAGGG 5580

CACCGTCTTC GTAGACATAC TCTGAGTTAC ATTTTGGACA ATTTGGTAAA TTGTTCATGG 5640

TTTCTCCTTA TCATCATTCA CTATTCTTTG AAAATCAAAA TTTCTCGAAC AGCAACTATT 5700

ATACCCTAAA ATCAGCATTT TGACAAATTT AGAAAAAAAC CGATATCAAT CTATCGGCTT 5760

TTCTACATTT ACATTCTTTT TTCAGCTTCT GCTTTGATTT TTTCAACTAC TTCTTGAATG 5820

TTCAAACCAG TTGTATCAAG GTAGACAGCA TCCTCTGCTT GTTTGAGAGG AGAAGTCTCA 5880

CGATGACTAT CCTTGTAGTC ACGCGCAGCA ATTTCCTTTT TTAGGGTTTC AAGGTCTGTT 5940

TCAATTCCCT TGGCAATATT TTCCTTGTAA CGACGCTCTG CTCTCTCATC AACAGAAGCT 6000

ACTAGGAAAA TTTTCAATTC TGCTTGTGGC AATACAACAG TTCCAATATC GCGACCATCC 6060

ATGACAATCC CGCCTTGCTG GGCAATTTCT TGTTGGAGAG AAACCAGTTT CTCACGCACT 6120

TGAGGAATTG CTGCAATAGC AGAAACATGA TTGGTCACTT CATTTTCACG GATAGGATGG 6180

GTAATATCCA CATCTCCTAC AAAAACAAGC TGGTCTCCAG TTTCTGAACG TCCAAAGCTG 6240

ATTGGATGCT GGTCCAACAA GGCTAGAAGG GCTTCGACTT CTTCAACTCC TAATTGGTTC 6300

TTAAGAGCCA TATAGGTCGC TGCACGATAC ATAGCTCCTG TATCAAGGTA GGTGAATCCA 6360

AAATCCTTAG CAATAATCTT TGCGACCGTA CTCTTACCGC TGGAAGCAGG ACCATCAATA 6420 

GCAATTTGAA TTGTTTTCAT ATCGGCTCCT ATTTTATTTT TATAACATCA CCTGGATTAG 6480

CAAACCAAGA TCCTGTAGCC ATGTGCCCAG GATTCAAGGC CTCTAACTGA GCAATGGAGA 6540 

TTCCTGCACG AGCGGCAATA GCTGCTTCCC CTTCTCCTGC GAGAACTTTA ATCGTTCCTT 6600 

CAGGATTAGC AGCTTCTTCT GAACTACTAG AAGTAGATTC TGGCTCTGAA CTCTGCTCAG 6660 

GCTGAGAACT ACTTGAAGAT GAGATTTGTA CTACACTGGC ATCAGAATCA TGAAAGCCTT 6720 

TTAAGGCTGC TGTGCGATTA CTCCCCCCCG ATGATAGATA GATGAGAACG ATGACCATCA 6780 

CCACCACAAT TACAAAGAAA ATACTAGCTA GGATCGTCAA AATACGATTA GCCATCCTAT 6840 

CAGCCCCTCC GTGGTTTCGA TGCCGACGCT CTGCTCTTGA TTCTTCTTGA TCATAGATAT 6900 

CTTCTTGCCA CGGTTCTTTT GCCATACCTT ACTCCTTGTT TTTTTTTACT TTTCTTATTA 6960 

CAATATAAAT ATGAACATGA AAATCACACT TATACCTGAA CGATGTATCG CCTGTGGGCT 7020 

TTGCCAAACT TATTCTGATT TATTTGATTA CCACGATAAT GGAATCGTGC GTTTTTACGA 7080 

TGACCCTGAC CAACTGGAAA AAGAAATTTC TCCTAGTCAG GATATCTTAG AGGCTGTTAA 7140 

AAATTGCCCA ACTCGCGCCC TGATTGGAAA CCAGGAAGCC TAAATCAATG GCGATAATCC 7200 

ACTCCCTCTA GTTTAGCACA TTTCCATGTA AAATTATAGT CTTTTCACTT TATTTTTTTC 7260 

TGTAAAATCA GGAAGGTCAC TTTTTTCTTT GATAAGATAA AGTGGTCTTT TTTTAGTCTC 7320 

TAAATAAATC TTACTGATAT ACTTGCCGAG AATCCCAATG GTCAAGAGTT GAATGCCTCC 7380

AAGAAAGAGA ATAACAGCCA TCAGAGAGGT CCAACCAGAT GTCGGATTGC CCAAAATGAG 7440 

GGTCCGAACC ACAACAAAAA AGGTCATCAG CAGAGAAAGA AAACAAGATA GGAGACCAGC 7500

TACAAAGGCT ATAATCAAGG GAAAATCTGA AAAATTAATA ATCCCTTCAA TGGAGTAGAA 7560 
AAAGAGTTGC CTAAAACTCC AACTTGTCTT GCCAGCCTGC CTTTCGACAT TTGGATAG'i'C 7620

CAAATAGTAG GTTTTGAAAC CCACCCAGGC GAAGAGCCCC TTTGAAAAAC GATTGGACTC 7680

GGTCAAGCTT AAAATGGCAT CGACTACAGA CCTTCTCATC ATACGAAAAT CACGGACACC 7740 

CGACGGCAGA GCTACTGGGC TGATTTTTTG CATGAGGCGA TAAAAGAGAA CAGCACAGAA 7800 

ACTGCGAAAG AAGGGTTCTC CCTCCCGACT AGTTCTCCGT GTCCCAACGC AGTCCAAGTC 7860

TACATTTTTG TCTAATACAT TTTTCATCTC AAACAACATA CTAGGAGGAT CTTGGAGGTC 7920 

TGCATCCATC ACCACCACCA AATCTCCTGT CGCATATTGC AAGCCTGCAT AAAGGGCTGC 7980 

TTCTTTGCCA AAATTTCGAG AGAAAGAAAT ATAATGGACT GCCGGATTTT GCTCCCGATA 8040 
GGCCTTTAAG AGTTCCAAGG TCCCATCACT TGATCCATCA TCGACAAAGA CATACTCGAT 8100

TTCTGTTTCC AAATCTGGAA GTAAAGCTTC CAGAGCCTGA TAAAAAAGAG GAAGTACTTC 8160 

CTCTTCGTTT AAACAAGGGA CGATGATTGA AATCATCATC TTAGTCTTCA AATCCATTTG 8220 

GATGCTTGCT TTGCCAACGC CATGCGTCTT CACACATTTG GGTGATGTCG AGTTCTGCTT 8280 

CCCAACCGAG TTCTGCTTTA GCTTTTGCCG GGTCTGAGTA GCAGGCAGCG ATATCACCTG 8340 

GGCGACGTTC TACGATGCGG TAAGGAATAG GACGGCCCAC CGCTTTTTCC ATGTTTTGGA 8400 

TAATTTCAAG AACTGAGTAA CCTTTACCAG TTCCAAGGTT ATAAACGTTT AGTCCTGAAC 8460 

CTTTTTGGAT TTTTTTCAAA GCTGCAACGT GACCCTTAGC CAAATCGACA ACGTGGATAT 8520 

AGTCACGAAC ACCTGTTCCA TCTTCCGTAT CGTAATCGTC TCCAAACACT TGCACTTGCT 8580 

CTAATTTTCC AACGGCTACT TGAGTCACAT ATGGCAAGAG ATTGTTTGGA ATACCGTTTG 8640 

GATTTTCTCC CAAATCACCA CTCTCATGGG CTCCGATTGG GTTAAAGTAA CGAAGCAAGA 8700 

CAACATTCCA TTCTGAGTCT GCTTTGTAAA TATCAGTCAA AATTTCCTCT AGCATGAGCT 8760 

TAGTACGACC GTATGGGTTG GTCACTGAAA GTGGGAAATC TTCCAAGATG GGCACTGTGT 8820 

GCGGATCCCC GTAAACTGTC GCAGAAGAAC TGAAGATGAT GTTTTTACAG TTGTTTTCTT 8880 

CCATGGCTTT CAAAAGGCTG ACAGTTCCAG CGATATTGTT GTCATAGTAG GCAAGAGGGA 8940 

TACGTGTTGA TTCGCCAACA GCCTTCAAAC CAGCAAAGTG AATGACACCA GTCGGTTCTT 9000 

CCTGCTTGAA AATATCTCTG AGGGTATCTG TGTCACGAAT ATCTGCCTCA TAGAAAGGAA 9060 

TCTCAACTCC TGTGATTCCT TCAACAACTT CTAAACTCTT ACGATTGCTA TTGACAAGAT 9120 

TATCCACCAC AACAACTTGA TGACCTGCTT GGATCAATTC AATAACAGTG TGGGTTCCAA 9180 

TAAAACCGGC ACCACCAGTT ACCAAAATCT TTTCTTGCAT CTTTTTTCCT CGATTCTCAG 9240 

ATTATTTTTT CTTATTTTAC CATTTTTGAC AGGGAATGTC ATTTGCCATC CTAAACTACC 9300 

TGATAAAATT TCAGTAAAAT GCTTATACTC TTCGAAAATC CAATTCAAAC TACGTCAACG 9360 

TCGCCTTGCC ATGGGTATGG TTACTGACTT CGTCAGTTCT ATCCACAACC TCAAAACAGT 9420 

GTTTTGAGCT GACTTCGTCA GTTCTATCCA CAACCTCAAA GCAGTGCTTT GAGTAACCCG 9480 

CGGCTAGTTT CCTAGTTTGT TCTTTGATTT TTATTGAGTA TTATTCGCTT TTTACTCGTT 9540 
TGACATAGTT TTCAATTGGG TAATTTAGAG GGTCCAAGGT CAACTCCTTG TCTTGGATCA 9600

GTTGGGCTAG ATGGTAACCA ATGATAGGAC CAGTTGTGAG GCCTGATGAA CCTAGTCCAC 9660 

TGGCTGCATA GACACCAGTT AAGTCAGGCA CCTGCCCAAA GAAAGGAGAG AAATCACTGG 9720 

TGTAGGCACG GATTCCAACA CGCTCAGATT TTGAAGTAGC TTCAGCCAAA ATCAGATAGT 9780 
GAGTCAAGGT GGCCTCCTCC ATTTGTTGGA GCAAGGTTTC ATCTACCGTC AAATCAAATC 9840

CCATGTCATT TTCGTGGGTA GCGCCTAAGG ATAATTTCCC ACCTGCAAAG GGAATCAAAT 9900 

CCCACTCCCC TTCTGGCATG ACAACAGGGT AATCTTCCAT GTCTTGGGCA AGCTGATAAT 9960 

CTCGTAGTTG TCCTTTTTGA GGACGGACAT CCACTTCATA ACCTAAAGGC TCTAACATGT 10020 

CCCCCAACCA AGCTCCCGTC GCCAAAATAA CCTGCTCAAA CTCCTCTTCA CCAATCTGGT 10080 

AGCCTGATGC TAACGGTGTC AGAGTCACTT TTTCTTTGAC CAGCTTGACA TGACTGACTT 10140 

CCAGCAAACG AGTCACTAAA AGTTGGCCAT CTACTCTCGC TCCACCAGAA GCATAGAGCA 10200 

GGCGGTCAAA TCCCTGCAAA CCAGGGAATA ATTCATTAGC TGAGGCTTGG TTCAGAATGG 10260 

CTAATTGCCC TATCAAGGGA GATTCTTCTC TGCGCTGGAG GGCCAGTTGA TAAAGTTCTT 10320 

CCAAATTGGA TTCATCCTTT TTCAAGAGAA AGACTCCCGA ACGCTGGTAA AAGTCGATTT 10380 

CTTGTCCTGA TTTCTCTAAA TCAGCTAATA AATCCACATA AAAATCAGCC CCCAAGCGCG 10440 

CCATCTTGTA CCAGGCTTTA TTACGGCGTT TGGAAAACCA AGGACTGATA ATTCCTGCTG 10500 

CGGCCTTGGT GGCTTGACCT TGCTCATGGT CAAAAACGGT CACCTCTAGG TCACTTTCTC 10560 

TCGAGAGGTA GTAGGCAGCT GTTGCTCCCA CAATTCCTGC TCCAATAATG GCAACTTTTT 10620 

TCATTGTCTT CACTTTCTAA CTAGATATGA TGGAAAGGAT TGGTTGATGC CTGACTAGGC 10680 

AAGATATCAA TAGACCACCC CTTATCTTCC TTCCATTGAC TAAGAAGTGC TGCGATTTTT 10740 

TCTACAAAAA TCACTTCGAT ATAGTGACCT GGGTCCAATG CAAGCAACCC ATCAGATAGC 10800 

ATATCCTGAG CAGTATGGTA GTAGATATCA CCAGTGATAT AGACATCTGC CCCCTTTGCC 10860 

AAAGCATCCT TATAGAAAGA CTGCCCGCTT CCACCACAAA TTGCTACTCT TGAAATAGGC 10920 

TTCTGCAAAT CATCCTCTTG ATAATGCACC ATTCGAAGGC TATCTAGGTC AAAGACTTGC 10980 

TTGACCTGTT GGGCCAATTC CCAAAATGTC TGAGGCTGAA TATTCCCAAT ACGTCCAATT 11040 

CCACGTTCTG GACCTGTTTC CTGCAGATAA GTCGTCTCCT CGATTCCTAG CATCTGACAA 11100 

AACCAGTCAT TGAGCCCATT TTCAACGATA TCAATATTGG TATGGCTGAC ATAAACTGCG 11160 

ATATCATGCT TAATCAGGTC GATGTAAATC TGATTTTGCG GACGGCTGGC AAGCAAGTCC 11220 

TTGATAGGAC GAAAGATAGG CGCGTGCTTG ACGATAATCA AGTCCACACC CTTTTCAATG 11280 

GCCTCTGCCA CTGTCTCTTC ACGAATATCG AGGGCAACCA TGACCCTTTG GATACCCTTG 11340 

TCTAAAGTGC CAATTTGCAG ACCACGGCTG TCTCCCTCCA TAGAAAATTC CTGAGGGCAA 11400 

AAGGCTTCAT AAGCTTGGAT CACTTCACTT GCTAACATGG AGCACCTCCT TGATAGCTTG 11460 

AATCTTATCT ACTAGAACTT GACGTTCTTC CAGATTTTTT TCTGGGATTT GTCCGAGGGC 11520 

GAACTCTAGC TTCTCAGCTT CTTTTTGCCA TTTTTGGACA AATACTGGAC TGACTTCTTT 11580 
GGACAAGAAG GGACCAAAGC GAACATCACT GGCTGATAGC TTCATTTGTC CTGCTTCCAC 11640

CACCAAAATC TCATAAAACT TTCCAGCTTC TTCTAAGATG CTTTCTGCTA CAATCTGGAA 11700 

TCCATGATCC TGTAGCCAGA TACGCAAGTC GTCTTCACGA TTATTGGGCT GGAGGATCAA 11760

ACGCTCTACA TTAGCTAACT TCCCCAAACC TTCTTCTAAA ATCCTAGCAA TCAAACGACC 11820 

ACCCATGCCA GCAATGGTAA TGACAGACAC TTGGTCAGTC TCTTCAAAAG CTGCCAAGCC 11880 

ATTGGCTAAA CGGACTTGGA TTTTCTCCTT TAGGCCGTGA GCCTCAACAT TTTTAACCGC 11940 

AGACTGATAG GGACCTTCCA CCACCTCACC TGCAATAGCG CTTTTGATTT GGCCTCTCTC 12000 

AACCAACTCG ATAGGCAGAT AAGCATGGTC ACTTCCCACA TCTAGTAAAA TAGCCCCCTG 12060 

TGACACAAAG GAAGCTACCA ATTCTAATCT CTTTGAAATC ATCTTCTCTC ACTTTCCAAA 12120 

ACTCTATTAC CTCTTATTAT ACCACATTTC AATCTTCAAC TTCCCAGTAA TATAAGCACC 12180 

TCTGGCGAAA GAAGTTTCAA TGTCCTAAAG TAATAAGTGA ATCCAATTGA AAGATTTTAA 12240 

ACAATTTGCA AAAATGTCAA AAAATAAAAA ATAAACAGTT TATTCAGAAA ATTCTTGACA 12300 

TATAAAAACA CATGGTAGAA TATAATTAGA AAGTTAGAAA AAATAAAAGT TTGACTAAAA 12360 

TTTGTATTTG AAGGTGGTGT TCAGATAAGA AATTTAGTCA GACGAACCAC GAATTTGCTC 12420 

TATGCTTTCT GGAATTTATC ATAACAGGAG GATACAGTCA TGGAACAAAC ATTGTTTGAA 12480 

TTAGAACTAC TTCCAGAGGA AGATATCATT GTCACAGGTC TCCCTAAGTA TTGTTCTTTT 12540 

ACTTGTTTAA TTACAGGTCG CTAGTTATAT TTTATATAAA ATAAGTAGCT TTACTTACGG 12600

AATAGGCTAG TGCTGTGTCT CTAGCCTATT TTAATAATTA GGAGTTTGTT ATGGATTTAT 12660 

TAGAGAAAGA ATGTTTAAAA TGTGATAAAA ATTTCCAACA GGGTGATATT TGGAATTACT 12720 

ATTATTTATC AGATAAGATG CCTGCACAAG GGTGGAAAAT ACACATAAGC TCCCAAATAA 12780 

AAGACGCTGT AAATATTTTT AAGATTGTGT ATAAACTATC CCAACTAAAT AATTGTAGCT 12840 

TTAAAGTTGT TAAAAATTTA GAGGAATTAA AAAAAATTAA TTCCCCTAGG GAAATGAGCC 12900

CTACTGCTAA CAAATTTATA ACTCTATATC CTAAGTCAGA ATCTGAAGCT AAGAGTATGA 12960 

TTTGTAATCT TACGAATAGA CTGTCAGAAT TTAAGGCTCC AAAAATACTA TCTGACTATC 13020 

AATGTGGAAT GCATTCTCCA GTTCATTATA GATATGGGGC TTTTTTAAAA AAACAAGCTT 13080 

ATGATGAAAA AAATAAAAAA GTCATCTATT TATTGCTAGA TGAAAAAAGG AAGAACTATG 13140

TAGAAGATAA GAGACAAAAT TTCCCTAGTC TTCCTAGCTG GAAAATGGAT TTATTTTCAG 13200 

AAGAAG 13206 
(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13104 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
CCGGATCCAG CGAAAAATAT GCTCTTTGAT GCTGTAAGTG GTCAAAAAGA TGCTAAAACA 
GCTGCTAACG ATGCTGTAAC ATTGATCAAA GAAACAATCA AACAAAAATT TGGTGAATAA 
AAAATTTGTT CAAGGGGGGT GGAAATCAAA TCCCCCTT^ AATTTATCAA TAGAGACACA 
AATAATTTAG CTTTCTTATA AAAAAGTAGT ATCCTATGAA AGGAGTTAAT ATGGAAAAGC 
AACAACCTAG TAAAGCAGCC CTGCTGTCTA TCATTCCTGG GTTAGGACAG ATTTACAATA 
AACAAAAAGC CAAAGGTTTT ATCTTCCTTG GTGTAACCAT CGTATTTGTC CTTTACTTCC 
TAGCACTTGC AACCCCTGAA TTGAGCAACC TCATCACTCT TGGTGACAAA CCAGGTCGTG 
ATAATTCCCT CTTTATGCTG ATTCGTGGTG CCTTCCATCT AATCTTTGTA ATCGTTTATG 
TACTCTTTTA TTTCTCAAAT ATCAAAGATG CACATACGAT TGCAAAACGC ATTAACAATG 
GAATTCCAGT TCCACGCACA CTCAAAGACA TGATCAAAGG GATTTATGAA AATGGCTTCC 
CTTACCTCTT GATCATTCCA TCTTATGTTG CCATGACCTT CGCGATTATC TTCCCAGTTA 
TCGTAACCTT GATGATCGCC TTTACCAACT ACGACTTCCA ACACTTGCCA CCAAACAAGT 
TGTTGGACTG GGTTGGTTTG ACCAACTTTA CAAACATTTG GAGCTTGAGT ACCTTCCGTT 
CTGCCTTTGG TTCTGTTCTT TCTTGGACTA TCATTTGGGC TTTGGCAGCT TCTACTTTAC 
AAATCGTAAT TGGTATCTTC ACAGCTATCA TTGCCAACCA ACCATTTATC AAAGGAAAAC 
GTATCTTTGG TGTTATTTTC CTTCTTCCTT GGGCTGTCCC AGCCTTCATC ACTATCTTGA 
CATTCTCAAA CATGTTTAAC GATAGTGTCG GTGCTATCAA CACTCAAGTA TTGCCAATCT 
TGGCTAAATT CCTTCCTTTC CTTGATGGAG CTCTTATTCC TTGGAAAACA GACCCAACTT 
GGACTAAGAT TGCCTTGATT ATGATGCAAG GTTGGCTCGG ATTCCCATAC ATCTACGTTC 
TGACCTTGGG TATCTTGCAA TCTATTCCTA ACGACCTTTA CGAAGCAGCT TATATTGACG 
GTGCCAACGC TTGGCAAAAA TTCCGCAACA TCACTTTCCC AATGATTTTG GCTGTTGCGG 
CACCTACTTT GATTAGCCAA TACACCTTCA ACTTTAACAA CTTCTCTATC ATGTACCTCT 
TCAATGGTGG AGGACCTGGT AGTGTCGGAG GTGGAGCTGG TTCAACCGAT ATCTTGATCT 
CATGGATCTA CCGTTTGACA ACAGGTACAT CTCCTCAATA CTCAATGGCG GCAGCTGTTA 
CCTTGATTAT CTCTATCATT GTCATCTCAA TCTCTATGAT CGCATTCAAG AAACTACACG 
CATTTGATAT GGAGGACGTC TAAGATGAAT AACTCAATTA AACTCAAACG TAGACTGACT
CAAAGCCTTA CTTACCTTTA CCTGATTGGT CTATCAATTG TAATTATCTA TCCACTGTTG
ATTACCATTA TGTCAGCCTT TAAAGCAGGT AACGTCTCAG CCTTTAAACT AGATACTAAT
ATCGACCTCA ATTTTGATAA CTTTAAAGGC CTCTTCACTG AAACCTTGTA CGGTACTTGG
TACCTCAACA CTTTGATTAT CGCCTTAATT ACCATGGCTG TTCAAACAAG TATCATCGTA
CTTGCTGGTT ATGCTTACAG CCGTTACAAC TTCTTGGCTC GTAAACAAAG TTTGGTCTTC
TTCTTGATCA TCCAAATGGT GCCAACTATG GCCGCTTTGA CAGCCTTCTT CGTTATGGCG
CTTATGTTGA ACGCCCTTAA CCACAACTGG TTCCTCATCT TCCTCTACGT TGGTGGTGGT
ATCCCGATGA ATGCTTGGCT CATGAAAGGC TACTTCGATA CAGTGCCAAT GTCTTTAGAC
GAATCTGCAA AACTAGACGG TGCAGGACAC TTCCGCCGCT TCTGGCAAAT TGTTCTACCA
CTTGTTCGCC CAATGGTTGC CGTACAAGCT CTCTGGGCCT TCATGGGACC TTTCGGGGAC
TACATCCTCT CTAGTTTCTT GCTTCGTGAG AAAGAATACT TTACTGTTGC CGTAGGTCTC
CAAACCTTCG TTAACAATGC GAAAAACTTG AAGATTGCCT ACTTCTCAGC AGGTGCTATC
CTCATCGCCC TTCCAATCTG TATTCTCTTC TTCTTCCTAC AAAAGAACTT TGTTTCAGGA
CTTACAAGTG GTGGCGACAA GGGATAATTT ATCCCCGCCA CCCTTTTTCA TTTTATACTC
TTCGAAAATC TCTTCAAACC ACGTCAGCTT TATCTCCAAC CTCAAAGTTG TGCTTTGAGC
AACCTGTGGC TAGTTTGCAC TTTGATTTTC ATTGATTATT AGCAATTGTC ACTGTAAATA
ATATCCTTGT AGCAAGCAAT TTTTCTCCTA GACTTGAAAT AAAGCGCATT TCTCTATATA
ATAATACTCA TATAGAAAAC ACCTTTTAGA AAGATACCTA TGCTTCCATA TCCATTTTCC
TATTTTTCAA GTATTTGGGG GGTTCGTAAG CCCCTGTCCA AACGTTTCGA GCTCAACTGG
TTTCAACTTC TCTTTACCAG TATCTTCCTT ATCAGCTTGT CTATGGTACC CATTGCTATC
CAAAACAGCT CCCAGGAGAC CTATCCGCTA GAAACTTTTA TCGATAATGT CTATGAACCT
CTGACAGATA AGGTTGTCCA GGATCTCTCT GAACATGCTA CAATTGTCGA TGGCACATTA
ACTTATACTG GAACAGCTAG TCAAGCCCCT TCTGTTGTGA TTGGTCCAAG TCAAATCAAG
GAATTACCTA AGGACTTGCA ACTGCATTTC GATACAAATG AGCTAGTCAT CAGCAAGGAA
AGCAAGGAAC TGACCCGCAT CTCTTACCGA GCCATTCAGA CTGAGAGTTT CAAAAGCAAA
GACAGCTTGA CCCAAGCAAT TTCTAAAGAC TGGTACCAAC AAAATCGTGT CTATATCAGC
CTCTTCCTAG TTCTCGGTGC GAGCTTCCTC TTTGGTTTGA ATTTCTTTAT CGTCTCTCTT
GGAGCTAGCT TTCTCCTTTA TATCACCAAA AGATCACGCC TCTTTTCATT TAATACCTTT
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AAAGAGTGCT 
TTGGGATTAT 
CTGTATCTGG 
GAGATTTTTA 
ACCGTAACCC 
CGCAAAGCTA 
AGCTATACTC 
CCTTTCTTTC 
ATTCAGATAG 
TACGGCAAGC 
AAACTCGTCG 
ATCCCACTTG 
AAAAAAGGCT 
GACCGTTTAA 
CGCATCTACT 
TTCAAGCACG 
GTTTGTAACT 
AATCCCAAGC 
TCCCTTGAAA 
CGTCAATTGA 
AGCAAAAACG 
AGATTTACTT 
AAAATCAAAG 
GTAGATGAAA 
GAAGAGATTT 
ATGAACATTT 
AAATTCATTA 
ATAGGTTCCT 
TTCAATTACC 
AAAGACTGCG 



ACCATTTTAT 
TTGGCCAAAA 
TCACTATCTT 
TGCCCGTTAC 
GTGTTATTCA 
TGAAGGAACT 
AGGTTATCGG 
CATCGGTTCT 
CAACAGGGAA 
GTGTAGATGG 
CAGAAGAACA 
TCGACAACGA 
GCAAACGCAT 
CAGGCTATGA 
TTGCCGACGA 
ATCCACAAAT 
ATATTGCCAA 
TCAACTTGGC 
CTATTCTCCA 
TCGCCCACAA 
CATACTATCA 
CCTTTTCTAC 
TGCAAACTAG 
CTGACGAAGT 
TCGAAGAGTA 
TCCGAGACAG 
AACTCTGCAC 
TCTGCATCGT 
TTCTCTGGAT 
TCTGTCACAC 



CTTGAACTGT 
TATGACAACC 
TTATAAAACA 
GATTAAAGAC 
AAATAAATCA 
CAACTACCAC 
ATTAGTTCTT 
ACGTGGCATC 
AGATGAGAAG 
GCTAATTTTT 
GTTCCCCTTC 
CAATGTTCAA 
TGCCTTTATC 
ACAGGCGCTT 
GTTTCTGGAA 
TGATGCTATC 
ACACCAGCTG 
AGCCTATGTC 
GATTATTAAT 
AATTATCGAA 
GGTATTGAAA 
TGAAATTGAG 
GAAGCTAGCC 
CAGTAACCAT 
TTAATCACTA 
AGACAAAGGA 
GTGTAATGAC 
GGATCATGGT 
GATTTGTCAC 
GGCTAGAGAC 



342 
TTAGGATTGC 



CTGATTACTG 
CATTTCCGTG 
GTGGCCAAGG 
ACCATTAGCG 
CCAAACCTCA 
CCTGATGACT 
TCTCAAGTCG 
GAGCGTCTCA 
CTCTATGCCC 
CTTATCTTAG 
GCTGGTTTTG 
GGAGGAAGTA 
AAACATTACA 
GAAAAGGGCT 
ATCACAACCG 
GATGTCCCTG 
GATATCAATA 
GATAATAAAA 
AAATAAGAGA 
AAACTTGATA 
TCTTTTCCCA 
GCAGGTTGCT 
ACCTACGGCA 
ATTATCTATC 
GCTTGGATCC 
AGTGATTAAA 
TGCTCCGCGG 
AATCATGGCC 
AAAGATGGTA 



CGACTCTGAT 
TACAAAATAT 
ATCCAAATTA 
CTGCTGGTGT 
ACGAAACAAA 
ACGCTCGTAG 
CAGACGCCTT 
CATCTGAAAA 
ACGCTATTTC 
AAGAAGAAGA 
GTAAATCTCT 
ATGCGACTGA 
AAAAGCTCTT 
AACTTACCAC 
ATAAATTTAG 
ATAGCCTCCT 
TTCTCAGCTT 
GTTTAGAGCT 
ACAATAAACA 
CTGGGCAAAA 
CTATGCGTTT 
AGATCTTTTT 
CAAAACACTG 
AGGTGAAGCT 
TCAACAAATC 
ACTTGTGTCA 
ACTGCCTTTC 
TGCAATTTTT 
TGCATACGCT 
ATCATAGAAT 



TACACTTATT 
TCTTTTTGTT 
CCATAAATAG 
TTCGCCTTCA 
AAAACGTGTT 
CTTGGTAAGC 
CTACCAGAAT 
CCACTATGCC 
ACAAATGGTC 
CCCTCTCGTA 
ATCTCCTTTC 
ATATTTCATC 
CGTGACCAAA 
TGACAACAAT 
CAAGCGATTA 
AGCTGAAGGT 
TGACTCGGTT 
TGGTCGTGTT 
AATTTGTTAC 
AGTCGTTAAA 
TATTGTGGGA 
ATACTCAATG 
TTTTGAGGTT 
GACGTGGTTT 
TTCCTAGAAT 
TAATCTGTTT 
TCTCGTGATT 
TATGGATTTT 
TTTGCTTAGT 
AAAGAGCGTA
TTTCCAACCA AAGGTCAAAC CTGCTATCAG CATGATAGTT CCATTTACCA AGAAAGAAAT 5100

ACTACCGACA TTCTTACCCG TTTTCTTACG AATAGTCAGG CTGACCATAT CCGTCCCACC 5160 

ACTGGAGATA TTGTTTCGAA GAGCAAAACC AATCCCCAAA CCCATAACAA CACCCCCAAA 5220 

AAGGGAATTG ATAATGGGAT CCTCTGTCAA GGTTGCCACA GGGACAAACT GGATAAAGAA 5280 

GGAACTCATA GATACCGTGA TAAAGGTAAA GACGGTGAAC TTATGGCCAA TCTGATACCA 5340 

AGCTAAGACC ATCAAAGGGA AGTTAATGGC GTAGAAGCTT AGCGAAATCG GAATATGAAA 5400 

ACCAAACCAG TGATTACTCA AGGCAGAGAT AATCTGTGCC AGACCTGTTG CACCACTCGA 5460 

ATACACATGC CCTGGTTGGA AAAAGAAATT AACTGCTACT GCTGATAAAA AACCATAGAC 5520 

CAGAGAGGCC GAAATCTTCT CATCATACTT TTCTCGAGAG ATACTTTGTA AGACACGTAA 5580 

AATTTTTATC TCATAAGCAA AGCGGCGCAG ATAATAGCGC CACCGCTTAA TTCGTTTTGT 5640 

TTGTTTCATC TTCTTCTACT TGTAAGCTGA GTTCCTCTAG TTGTTTGAGA GCGACTGTTG 5700 

ATGGAGCTTG TGTCATTGGG TCAGTTGCCT TGTTGTTCTT AGGAAAGGCA ATGACTTCAC 5760

GGATATTTTC TTCTCCAGCA AGCAACATGA CAAAACGGTC AAGCCCGATA GCCAAACCAC 5820 

CGTGTGGTGG GAAACCATAG TCCATGGCTT CAAGAAGGAA ACCAAACTGG TCATTGGCTT 5880 

CTTCAGTTGA GAAACCAAGA GCCTTGAACA TGCGTTCTTG AAGGTCTTTT TGGTTGATAC 5940 

GAAGGCTACC ACCACCAAGC TCATAACCGT TCAAGACGAT ATCGTAAGCA ATGGCACGAA 6000 

CCTTAGCCAA ATCACCTTCT AATTCATGAG CAGTCTCTTC CTGTGGAAGT GTGAAAGGAT 6060 

GGTGGGCGCT CATGTAGCGG CCTTCTTCTT CAGACCATTC AAACATCGGC CAGTCAACCA 6120 

CCCAAAGGAA GTTGAACTTA TCATTATCAA TCAAGCCAAG CTCTTTAGCA ATACGTCCAC 6180 

GAAGGGCACC CAGTGTTGCA TTAGCCACTT CAAGCGTATC CGCCACAAAG AGAACCAAGT 6240 

CCTTATCTTC AAGAACAAGC GCTGTTGTCA ATTCTTCTTG GATACCAGTC AAGAACTTGG 6300 

CAACTGGTCC GTTTAATTCT CCATCAACCA CCTTGACCCA AGCAAGACCT TTGGCACCAT 6360 

ACTGTTTGGC TACTTCCGTC ATCTTGTCGA TGTCTTTACG TGAATAGTTG TCCGCAGCTC 6420 

CTGTGACCAC AATCGCTTTT ACAGCAGGTG CTTCTGAAAA GACTTTAAAG TCTACACCTC 6480 

GGACCACTTC TGTCAAGTCC TGAAGCAACA TGTCAAAACG AGTATCTGGC TTGTCAGAAC 6540 

CGTAAAGAGC CATAGCATCA TCGTATTTCA TACGAGGGAA TGGTAGCGTT ACTTCGATGC 6600 

CTTTTGTTTC CTTCATCACG CGCGCGATCA AGCTTTCTGT AATATCTTGG ATTTCTTGCT 6660 

CAGTAAGGAA GGACGTTTCC AAGTCGACCT GAGTAAATTC AGGCTGGCGG TCTCCACGCA 6720 

AGTCCTCGTC ACGGAAACAT TTAACGATTT GGTAGTAACG GTCAAAACCA GCATTCATCA 6780
AGAGCTGTTT CGTGATTTGT GGACTTTGAG GAAGAGCGTA AAAATGCCCC TTATTAACAC 6840

GAGACGGCAC TAAATAATCA CGCGCCCCTT CAGGCGTTGA CTTAGAAAGG AATGGTGTCT 6900 

CCACGTCGAT AAACTCCAAC TCATCCAAGT AGTTGCGGAT AGAGTGGGTC ACCTTGGCAC 6960 

GAAGTTTAAG ATTTTCCAAC ATTTCTGGAC GACGAAGGTC AAGGTAACGG TAACGCAAAC 7020 

GTGTATCGTC ATTTGCCTCA ATGCCATCCT TAATCTCAAA TGGTGTTGTC TTAGCTGTGT 7080

TAAGCACAAT AAGAGCTGTC ACGTTTAACT CAACCGCACC AGTTGGCAAC TTATCATTGG 7140 

CTTGTCACGC GCAGCGACCT GACCAGTCAC CTCAATAACA AATTCGCTAC GAAGGCTTTC 7200

AGCTGTTGCC ATAACCTCTG CAGATACTTT TTCAGGGTTG ATAACCAACT GCATGATTCC 7260 

TTCACGGTCA CGAAGATCGA TAAAGATCAA ACCACCAAGG TCACGACGAC GGCCAACCCA 7320 

TCCTTTCAAG GTTATTTCTT GTCCGATGTG TTCCTCACGA ACACGACCAG CATACATACT 7380 

ACGTTTCATT ATTTCTCTCC TCTTTTATTC TGTTACTATT TTACCATAAA AGCGCAGCTC 7440 

TTCATGAAAA TCATCAGAAA AGTTTGCCAG TCTTTAAAAG TCAGGTGAAA GCCCTAAAAA 7500 

TTAGCGCTAA TACTCTTCGA AAATCTCTTC AAACCACGTC AGCGTCGCCT TACCGTATGT 7560 

ATGGTTACTG ACTTCGTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC 7620 

GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA CCTGCGGCTA GCTTCCTAGT 7680 

TTGCTCTTTG ATTTTCATTG AGTATAATAC AAAAATCCGA TGAACTTCAC CGGACTCTTT 7740 

TATTTTGAAT TTTTGCCTGC TTTACGCTTT TCAGCGATTT CGGCTGCCTT TCGAGGCAAG 7800 

ACAATTTCCG TTATGTAAGC CGTCCCAAAA CGCAGTACAC CTGCAATAGG AGCAAAGACA 7860 

ACTGCTAGAT AGTTATAGAA GAAATCGCCT TTGAAGGCAT AAGCTAGCGC TCCAATGATG 7920 

AAAAATAGAA CGACTGCCTG AATCACTGCT AATAAAATTA CTCGTTTCAT GTGACCTCCT 7980 

GACTCTATTA TAGCATGAGA ATCATCAAAA AGCCGACTAA ATTATTCAAA GCGTGAAGAG 8040 

AAATACTGTA GACCAGACCT TTTCTGCTAA TGTAAGCCAA ACCCAAACTA AAACCAAGGC 8100 

TAAAATAGAC AAAAAATTGT TGCACATCAC CTGGAAAATG AATCAAGGCA AATAGAAGAC 8160 

TAGATACCAG AAGAAAAATC AGGGTTCGTT TACTATTGTC CTGCTTAGGA AAGAGATAGC 8220

GTGCTAACAT CCCTCTAAAA ACAATCTCTT CCGTCAAAGG AGCAAAAATA ACCACAGCAA 8280 

AGAATGAGAA AAGTGGTTGA GACAAGGTCA AGTCTGTCGC TATTTGCTGA TTTACTGAAG 8340 

GATCATCTGG CAAGAAGAAT TGAACGACCA GAGATAAGAA CCAAACCAAG ACAGGAAGCC 8400 

AAATAAATCG ATTAAAGCCG CTCTTCTCAA TATGAACAGG AGCCTTCTGA TACCATTTGT 8460 

AAATGCCGTA CACATATACT CCAGCCAAGG CCACATAGAG TAGAGTAACA GCATAGGGTG 8520 

AAGCGCCTAA AGCAAGCGAC GCAGTCGCGA GCCCCTGAAT AAAGCCATAG ATAAATAAAA 8580
AGGATAGAAG
TTATTTGAAA 
TCCTACAAGC 
AAAAATATAT 
TTCCAAATTA 
TGGTGTAGCG
TTTTTGAGCG 
TGTATGCAAT 
TGTAATATTA 
TTCATTCAGA 
TAAGAAATTG 
CCATGGCATC 
CATCCCTACA 
CGTTTCTTTT 
TCCATTCAAT 
GATTAAACCA 
CCACTACATA 
CGATTCGACT 
GTACTCATCC 
GTCGGTTTGA 
GTTACACACC 
GAATTTTTCT 
CACAACTTCC 
CATTTCTTGT 
GATTTTATTC 
TAGTATAAAT 
TTCATTGTAA 
TGTCATTTTG 
TACTCTTCTA 






GGCTAGAAGA ATCCAGCCAA GGTTTTTAAG 
TAACGTTTTA CCATAGGTAA CTGCATCACA 
AAGAAAGCTA GTAACTGAAT CTCTCCTGTC 
AAGGCTGGTA AGACATATTG GTGTAATTGG 
GCCTGACGCT CCCCTTCATC ATAAGAATTT 
AAAAATTCCA AATCAAACTG ACGAACAATC 
ACTAAGAATA CCACAAAGAG TAAGAAAGAA 
ATAATCACCT CACTTAATGA AATAAAAATA 
AAAGCAATGG TTCCAAACTC AAGATTCCGA 
TCGTCATCCA TTTCCTCTTG ATACAAAGAA 
AAAGTCAAAA ACATACTAAT GAAACCTATC 
AAGGCTTTTA CATCTAAAAT AATTTCGTGG 
AACATGCCCA AGAACCCCCC AAGACAATAG 
TCATATTCAT TCTCCTTTTT CACTTGCTAG 
TACTGGGATG AGAGCAAAGT AGACCCAAAC 
GCTTAGGTCC ATCCCAATCA GTAGAAATAC 
ATAAATCACT TTATACTTGT TCATCACTCG 
GTTTCGTTGA AAATTTGAGA TATTTTCAGG 
CGTTCTAGTA GGCTAATGGT CTGTCTGGAA 
TTGAGACCAT CGCGAGCTCG AAGCTCTTTT 
TACTCTCCGT CAAATTCAAC GGTTTGGATA 
TTTCCCGTAT TATCTACACG TCGTAGCTTT 
CAGTTATCTG GCCCAATATA CACTCCCGTT 
AATAATCTCG ACATTTCTGC GTTTCCTTTC 
TCTAGTTTCT TGATTTTTTT AGAATTATTA 
CCTAGTACCC ACATTATAAC TCCTTTCTGC 
CATATCPTTT TCTTTTTGAC AAGTATAGTT 
CAAAAGAAAA AGGTCAGGAG TAGGTTCCTG 
AAATCTCTTC AAACCACGTC AGCTTCACCT 



TAATTTCATA 
TTGATATAAA 
AAGAAAGAAA 
AATAAAATTC 
ATATAGTTCA 
GCAATGGTTT 
AGGAAAAATG 
GCCAATGGAA 
TACATTTGCA 
TGAAATTTTC 
AGTAAACAAA 
GATTCGACAC 
ACATCAAAAA 
ATTTTTGGAT 
AAATTGGTCG 
GCTGACTAAT 
TCCTCCTCCA 
GCAATGATAA 
ACCCCTGCCA 
AGACGATTTT 
TCCTCAATAC 
ACCCATTCCT 
ATAATTGGTT
TCTTTTCGCT 
GAATAAAAGA 
TTCCTATTTC 
GTCAAAAAAA 
ACCACTTTAT 
TGCCGTAGGT 



GATAACTCCT 
CATGGATGGC 
TGATAATAAG 
GAAAACTCTG 
AGACATCCTT 
TAAAAAGAGA 
TTTGAGGGTT 
TCGCTACACC 
CATAATAGGT 
TGCTTTTCTT 
TAGCTGATAT 
GTGCCTTAAA
TAACAATCTA 
TTCTTTTCAA 
CTTTGATAGG 
AAAGCTATGA 
AACGAAATAC 
TGGATGGGGT 
GTTTGGCTAG 
TTAGTTGCAT 
GTTGCAACTT 
CATCAACATC 
CCTTTCCAAT 
CAAGTCTTTT 
AAATCATAAA 
TTAACTTGAA 
TTATGATTTT 
CTATCATTAA 
ATGGTTACTG 



8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320
ACTTCGTCAG
TCCACAACCT 
CCATGTTTTG 
CCTGCGGCTA 
CAAAGATTTC 
TTTGGTTGTT 
GGGTCTTAGC
AATCACGCTC 
TATTTGCCCC 
TCACACCTTG 
CAGCAGTTTC 
CGGTCAGGTC 
CCAGACCACG 
GACGCACAGC 
CATTCTCTAC 
CCTCCAAGCG 
TCAAGGCTTG 
TGACACCTTG 
CGGTAGCTGG 
GCCCTGCCTG 
TTTGCACTTC 
CTTCTGGACG 
TTACGATATC 
GCGTGCGCAC 
ACTGCCACTT 
TCATAGGGAA 
CAGTAAGAAA 
AGCTAGCAAG 
TAGTGAAAAA 
ACTAGTACGA 



TTTCATCTAC 
CAAAACCATG 
AGCTGACTTC 
GCTTCCTAGT 
TGAGAAGTTT 
CTTGACCGTC 
CGCAAAGACA 
TGCTTTGAAA 
TTCGCCCAAG 
CTTTTCAAGG 
AGGGCCTCCA 
ATTGCCCTCA 
CACCATATTG 
ATCAAAATGA 
TGCCACCTTG 
ACGTTGGCTA 
GCGGTAGGCT 
AATACCGATT 
ATTGCTAGAG 
TGGACGCTCA 
TGGGGCGAAA 
GAGGGTAATA 
CGTTGTATCT 
TTCTGCATAG 
AGCAGACTCA 
TCCTCTTTAA 
AAAATTAGGA 
GAAAGACCAA 
CAAGCTGTTC 
GCTAGAACCT 



AACCTCAAAA 
TTTTGAGCTG 
GTCAGTTCTA 
TTGCTCTTTG 
TGGCTGATTG 
ACTTGTCOGC 
TCGGCTGACT 
CCTTGTTGGC 
ACTGCGATAT 
ATGAGAAGCA 
AAGTAAGCAA 
ATCTCTGTGA 
GTATCGATGA 
GCTTGGCTTT 
TCTTCTTTTT 
TCCTTAGACA 
GCACGGCTCT 
TCCTTCAAAA 
CCAAAACACT 
TAACGGAACA 
AGTTTATTTT 
TGACGGTCAC 
CCGACAGAGC 
TTGTAGCGTT 
GCAGGTAAAA 
ACTTAATAGT 
TTTAGATATC 
CAAATAGCAT 
CCACAGGTAT 
CTGGAGCTAG
CCATGTTTTG



ACTTCGTCAG 
TCCACAACCT 
ATTTTTATTG 
TCTCAAGTGA 
TTTCGACTTC 
TGAACTGAGC 
GAAGAGCCTG 
AGACATCTAG 
GGCGCTCTAC 
CCAAACCATC 
TAAACTCGAA 
TGTAATCTAC 
CTTCATCAAG 
CCTTAGAGTC 
AGGTCTCCTT 
CAGGATTTCC 
AATGGGCTGC 
CAACACCAAT 
TAGGTCCCAT 
CCACATAGGA 
CCTTGTCATA 
GACTGATAAC 
TGAAAATCTC 
TATCCTGCGT 
CTTATTTTAC 
ATTTTTGAGA 
CCAAGTCAAC 
GGATAAGGTA 
ATTTTTCATG 



AGCTGACTTC 
TTCTATCCAC 
CAAAACAGTG 
AGTATAAAAT 
CACTTGCACT 
GCTCTCTCCT 
TTTTAGTTTA 
TACCAATTCC 
GGCGTTTTCG 
ACCAAGTCCA 
GTAGCGACCA 
AATGGTGTGG 
TCCAAGATTT 
AAAGTCCAAG 
CAAGACACGA 
GAGCGGTGTC 
AAGAGTGTTG 
CATAGCGATT 
CTGGTGGAAT 
GTAGTAGAAC 
ACGGACAACG 
AAAATCGTAC 
CTCGTAATGC 
ACGGGCAAAG 
TCCTTTTGGT 
CATAAATAGA 
TTAAGAATTG 
TGTATATTCC 
AACAATAGAC 
AGCATGGCAC 



GTCAGTTCTA 
AACCTCAAAA 
TTTTGAGCAA 
CCTAGTTTTT 
TCTTCTCGGG 
AGGGTGATGA 
CGGTTGAGGT 
AAGGCCTTGA 
ATAGGGAGGG 
AAACCAAATC 
CCCGCACAGA 
TTGTAGTAGT 
TCCAACATCT 
ATAGACGGCG 
AGAGGATTTT 
AAATAGTCAA 
AGGTGCAATT 
GTTTCCACAT 
TGGCGCAAGC 
TTGCTTGGCT 
GGTGCAGTTC 
ATTTCCTTGG 
TCAAAAATAG 
CCCTCAACGT 
TTTTGTAATT 
GGGATTAAAA 
TCAAAAAAAT 
ATACGGCTAC 
CTAAAAAATT 
TAATCTTTGG 



10380 

10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120
TTGAACTTTA CCAGACACAT ACAGAGTAAA GAAGAGAAAT
ATTGAATAAA TTAGCCAAAC CAACTAGACT AAGTCCTACG 
AGGCAAGGAC TGCTTCCCAA AATAATCATT GCCCGTAAGG 
TAAAACACAG AATTGATTGA TAAATAGTGC CTCTGTATAA 
GCTCAAAAAG AAGATATTAT AAATTCCACC CAAAGCGCCA 
GACAGCAAAG AGCATAAAAC CAAAGTTTTT CTGTCCACTT
ATTTCGGTAA ATTGTTAGGA ACTGGTCTTT GATAGAAAGC 
ACCATCAGCA GATGACATTG ACAGGCTCAA TTTGCTTTTT
TGATACTAGG AAAAAGCAGG CATTGATTCC CGCAACGAGA 
AGCTAAGAGC CAGACTCCGA AAGCTTGACC ACCAATAGCT 
TGAAAAAGAA TAAGCCTCCA TCAGATCATC TTCAGCTACT 
ACGCAGGCCA CCTGCAAAAT CACTGATGAT ATCACTAATG 
AGAAAAGGCA AAGAGACTAG CTTGCTGAAC AACTAGGGCT 
CTGAAACAAA CCGCTATAGA CCATCCATTT GACCTTGTCC 
CCCTGCAAAA ACTGTAAAGA GGGTCGGAAG AATCATGACA 
AAAAGATGCT TGTGACAAGG TCGATGCATA GACGATAAAG 
ACCAAAAGCA TTGAAGAAGC GTGG 
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19250 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



AGCAAACCAA 
GTCTCCCACA 
CTACTGATGA 
GAAAAATTCA 
CCCAAGGAAT 
TTAAGAAAAA 
TTCTCATTTT 
CCTAAAAAGA 
GAAAAATTGT 
GAAATATAGG 
TTTTCCTTAA 
ACATTGATCA 
GCTAGAAAAA 
CTCGTGTAAT 
ATATTCGCCA 
ACCAGGTTGA 



GCACGACTTG 
TCATCAATCT 
TGACTGATAC 
AGAGAGAATG 
TAATAAGCAA 
CGAGACGTAA 
TTAAGTTTTC 
GGATAGTGGC 
TGACCGATAG 
TGATGAACTG 
TAAGAGGCAT 
AACACAGGCT 
ATAGAACCGC 
CTGCCCGAAT 
TAGCAACAGC 
AAATCGAAAC 



12180 

12240 

12300 

12360 

12420 

12480 

12540 

12600 

12660 

12720 

12780 

12840 

12900 

12960 

13020 

13080 

13104 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CCGGGCAAAT AGTTTTGAAC TTTTCATCAT TTTCTCCTTT AAAACTTTCT CTCCATTATA 60

GACTCTTTTC AGAAAGTTGT CAACAGAATT TTCAGAATTT TTGAAAATTA TTTTTCAAAC 120

AACATCTTTG CAAAAAATAT GAATATCGTA AGCGCGTCAT AACAAGGTAT CTATCATTCA 180

TGGAGCTCCT CCTGTATACT ATTAGTAAAG TAAATATTGG AGGATATTTT AATGCCACAA 240

CCTATTGTTC CTGTAGAGAT TCCACAATCT CGTCGTTTTG ATTCTAAAAA GAGAAATGAT 300

ATTCTrCTTA AAATTCGTAT TGGCAAGCTT GAAGTAAGTT TTTTTCAATC TCTCAATCTC 360

GAAATGATAG
AGGGCAGGTC 
TTATCTCGTT 
TGGTGGACGT 
ATATAAACGC 
TCTCGCACCT 
GTAGATTGAA 
CTGTCCTGAT 
TGATTTCTAT 
GTCCATCTCC 
TGAAGTTGAT 
CGAATCTCTT 
GCAAAGCCAA 
ATAACGAGTA 
TTTATAACCT 
CCTATCGGGT 
AAGCAGAGAA 
TACAAGAATT 
TTAGGACTTT 
CGACGCTAAG 
CACTTGTTGG 
CAGATAAATC 
AAAGAGACTG 
AACCCCTACT 
AACTAGGAAG 
AAGACGGACA 
TGGGACGGAG 
GGGTGGTTAT 
GCAAGAGCTA 
GAATCTATAA 



AACAGCTTTT 
TATCTCGTGT 
AAAACCCACT 
AAAGACCGCT 
TTTGAGAACG 
GAACAAGTAG 
ACTAGAATAG 
CGATTTGTCC 
TGAAATGAGG 
GATTAACGAT 
TCATGACATC 
TCCACACTTG 
TATTAGTCGG 
AAAGATAATC 
GTTGCGAGAG 
TCTAGAGAGT 
ACAAGGGATT 
CCTAGGAGAT 
AGTCCTCTAG 
CTTGGTAAAC 
ATGTTGGGCG 
ATCCTTAGGA 
GGAGGCTTTG 
GGAAGACTTC 
GGCAATTGAA 
TCTGGTCCTT 
TAAAAGAGTC 
TTTTAAAAAA 
TTATTATGAG 
CAGTACGCAT 



GGATAAGGTG 
GTGGGAAAAC 
TTGAATTGGA 
TTAAAGTCCT 
GCAGACTGAC 
ATTGGCTGAT 
TACACCTCTG 
TGTTATTATT 
ACTTTCTTTT 
GGACTTTATC 
TTCCAAAGTT 
TTCAATGGGG 
AATCTTTAAG 
ATCTGGATAA 
AGACTATTGA 
GATAGCCATC 
ACGCTTTACC 
TATTCTGGCT 
TTCTGCCTAT 
TGCGAACAGC 
CATGTGAGAA 
GCTAAAGGTT 
CCAGCTGATG 
TTTGCTTGGT 
TACAGCCTCA 
TCCAATAATC 
CAGTGGACTC 
GCGAGGGTGG 
TTTGTTGGAA 
CGACTGCTAA 



TTGCTCTATG 



TGATATGAGA 
TCCTTTCTCC 
TTACTGGGAT 
TTGGCCCAGT 
GAAAGGCTTT 
CTTCTAAAAC 
TCATTTTACT 
TATACTCATC 
ACCTCCTTCT 
CGAAAGGCTT 
TTCATCTCTG 
GTACTTGATT 
GCTTGTGAAA 
CTCAGCCCTT 
TGACCTACTA 
ACCATGATCA 
ATGTTCATTG 
GCGATAGCAG 
TAGAAGCTTA 
GGAAGTTTTT 
TAGCCTATTG 
AACGGCTACA 
GCCGTCGTCA 
AGTATGAAGA 
TAGCTGAACG 
TTTTAGCCTA 
TTATTTTCTC 
ACAGCTAAAC 
AATATTTCTA 



ACAATTCATC 
GAAGGAATCG 
GGTCAAATCT 
GGTCAAGGAT 
ACAGAAAAGG 
TCTATCACTC 
ATTGTTAGAA 
ATAAATCCAT 
TGCTTTCAAA 
CCAGTCCTTG 
TATTCTTAAA 
GTGTGTATGG 
TATGCCATAT 
GCTCCTATTC 
ACTTCATGCG 
TTGGACTTTT 
GTGTCGAAGT 
TGATATGTTG 
TCCAAGGTTT 
TCGTCAACTG 
TGAAGTGCCC 
TGATCAGTTA 
GAAACGTCAA 
GTCAGTTTTA 
AACCTTTAAG 
CGCCATTAAA 
AGCTCAGTTT 
AAAGTTTTGA 
GTCATCAATT 
TAAATCAATT 



TATCTAGCCT 
ATTCACTGGC 
TTCTCTTTTG 
TTTGGCTACT 
ATGTCAAAGC 
CAAAAATATA 
ATCGATTTTA 
CAGAAAGTCG 
AAGCACTCTA 
TATAACATCT 
TCCACGTTTA 
AGGAATAAAT 
AGCATTGTCC 
CTAAAGCCCC 
GATGAAACCT 
TTGTCAGGTA 
GGTTCAGTAG 
CGGCAGTAAC 
AGGAGTAAGG 
GAAGAAGCTG 
CCCAAGCAAG 
TTTTCCTTGG 
GAACATCTCC 
TCGGGTTCAA 
ACCATTTTAA 
TCATTGGTTA 
AAAAAAACGA 
AGGAGCTAAA 
ATAGTGCGTT 
TTCCTTTCCT 



420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 



WO 98/18931 



PCT/US97/19588




AATCGATTTG TTCATATCTT ATTACAATCC ATTATAAATA GCGAGAAATA TCTATCCTAT 2220 

CTTCTAGAAT GTCTTCCAAA CGAGGAAACT CTCGTAAACA AAGAGGTTTT AGAGGCCTAT 2280 

TTACCGTGGA CTAAAGTTGT ACAAGAAAAG TGCAAATAAG AAATCTCCAG ATTAGGAACT 2340 

ATATATGAGT TCTCTAGTCT GGAGATTTTT CAATAGACTT CGTTATTGGG CGGTTACTTT 2400 

CGAAACTTTG AAAACTTCAA AAAACGGATT TTTATCGCTC TGAACATCAA AAAAGAAAGG 2460 

ACGAAATTTG TCCTTTCTCA AGCTTAGCTT TTCTTCAACC CACTACAGTT GACAAAGAGC 2520 

CCTTTATTCT ATCAAACATG AAGCGCAAAA ACAAGCCAAA AATCCGATAG AATGGCTATC 2580 

CCTCGACTAT CAAGTAAGAC ATTTCCATCA AATACGTTCA ATTTTACTCT TGTTCTACTA 2640 

AGAATTAATC ATCTCGTTTT GATTTATTAA AAATATACAA TTCAGCTTTT CCTCCAAACT 2700 

ATTTTATCCA CTATCCCTGT ATAGCTCTGT ATTATCTTAA CAACTTTAGT AGAGACATTT 2760

TCCTCAACAT AATCCGGAAC CGGTAATCCA AAATCCTCAT CTTGTGCCAA GCTAACAGCA 2820 

GTTTCAACTG CTTGAAGAAG AGAATTTTCA TCAATGCCTG CCAAAATAAA TCCTGCCTTA 2880 

TCTAAGGACT CAGGACGTTC TGTACTTGTA CGAATACATA CAGCGGGAAA AGGATAACCT 2940

TGACTAGTAA AGAAACTACT TTCTTCCGGT AAAGTTCCCG AATCAGATAC TACAACAAAT 3000 

GCATTCATCT GTAAACAATT ATAGTCATGG AATCCTAGTG GCTCATGCTG AATCACACGT 3060 

TTATCTAGTT TAAAACCGCT CTCTTGTAGC CTTTTCTTTG ATCTAGGATG GCAAGAATAT 3120 

AAGATTGGCA TATTATACTT TTCAGCTAAT TGATTAATTG CTGTAAAGAG AGAAATAAAA 3180 

TTTTTATCTG TATCAATATT TTCCTCACGG TGAGCTGAAA GTAAGATATA ACCTCCTTTT 3240 

TTCAATCCCA AACGTTCATG GATATCTGAA GACTCAATAG CAGATAAATT TTTATGTAAC 3300 

ACTTCTGCCA TAGGAGAACC AGTTACATAT GTGCGCTCTT TAGGTAAACC ACACTCATGT 3360 

AAATACTTAC GTGCATGTTC AGAGTATGCT AAGTTAACAT CTGAAATAAC ATCAACAATC 3420 

CGACGATTAG TCTCTTCCGG TAGGCACTCA TCTTTACAGC GATTGCCAGC CTCCATATGA 3480 

AAAATTGGAA TATGTAAACG CTTGGCAGCA ATAGCTGATA AACAAGAATT TGTATCCCCT 3540 

AAAATCAATA AAGCATCTGG TTTAATTTGA TTCATCAATT TGTATGAAGT ATTAATAATA 3600 

TTCCCTACAG TAGCACCAAG ATCATCTCCA ACAGCATCCA TGTATACGTC CGGAGTGTCT 3660 

AACCCTAAAT TATCAAAGAA AATACCATTT AAATTGTAAT CATAGTTTTG TCCAGTATGT 3720 

GCCAAAATAA CATCAAAATA CTTTCGACAT TTAGTGATAA CACTACTTAG ACGTATAATC 3780 

TCTGGACGTG TTCCCACAAT AATCAATAAC TTAAGTTTGC CATTATCTTT AAAGTGAATA 3840 

TCACTATAAT CTGTCTTAAT TTTCATTTAT TTCTCCACTT GTTCAAAAAA AGTATCTGGA 3900
3960
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700
GGAACGACAG
TGCTCAGGAT 
GTTCCCATAG 
ACACCAGCTT 
GCTTCTACAG 
TAATCCACAC
AAACGGATTT 
TGTTTCTTTT 
TTGAGAACCG 
AATTGTGACA 
CCATCTCCAT 
AATTCTTTAA 
GCTTCAATTC 
CTTGGAGCCT 
AAATTGTGAA 
AGTTCTTCCT 
ACATCTGAAT 
CCAAGATTTT 
AACTCAGGAT 
CCTGTCACAA 
TGTGTTGGTG 
GGATATGGTG 
TGTAAATAAA 
ACCAAATCAG 
ACATCAAATA 
AATGTGTCCA 
TCAATATTCT 
CGAGTTCCAA 
ACGGTTCTTA 



ATCCACGGCT 
TTACCGTCCT 
CATTGACAGG 
CGATAGCCGC 
GGAAAAATTC 
CATGCATAGC 
TCCCAGCCAC 
CATCTCGCGA 
CATTCCCAAA 
TATATTACAC 
ATGGATTTGA 
AATGCCTATA 
CCTCTGGACG 
CTTCCTGAAT 
AATCTAATAC 
CAGCAATTTG 
ATTCTTCAAT 
CACGACGATG 
GCGTATAGTC 
ATATGCTCTC 
TAAAATGATA 
AATAGATATC 
AGGCCGCCAG 
GTTTTTCTGA 
AAGTTTGTTT 
AGACCTGATC 
TACGTGTTCT 
ATACTACAAC 
AAATAAATTA 



ACACAGAACA 
GGACTTAGCA 
ATAAGCCGCC 
AGTGAGGACA 
ACAAGAAGGT 
ATTTTTTACC 
TTCTGGTACT 
AAATATACGA 
TGAACCTGTC 
TTCTCCTTCT 
AGCTTGACTC 
AATATTATTT 
TTCAGTTGTA 
ACCACCACTA 
TTCTAAAGGT 
GCGAACACGA 
AATCCTTCTA 
AGCTGTAATT 
CTCTTGAATT 
TGGAGTTTTT 
CTGAGCCAAA 
GTAAGTGCGC 
TGAACTAGCG 
CTCTAAAATA 
ATCTTTCATA 
CAACATTTGA 
TAACTCTTTG 
TACTTTTTTC 
GATAACGGCT 



TTCCCATAGC 
ACAGCAATCT 
TTATCTGTAG 
TTCTCCGTTC 
ACTTGTTTAA 
GAAGCTAAGT 
TTTACCTGAA 
ATCTCTGAGA 
CCTCCTGTAA 
AGTATGTCTG 
ATTGCTTGAT 
TCATCAGCAC 
TCTCTCATAA 
TCTGTTAAAA 
TCGATCATCT 
GGATTCATAT 
ATTGCTCTAA 
AGAATAAACC 
GTAGTTTGTA 
CCTTCTCTTA 
ACCCCAACTG 
AAACCAGCTT 
AAGGTCGTAC 
GCCTTCATTC 
ATAGACAAAT 
CGGTGTTGGC 
ACCAAAGGAC 
ATATATTTAC 
AATCCATAAC 



GAGTCACACA 
TTTCCATCAT 
AAAGACAGAT 
CCAAAATGTT 
GAGCAGCAGC 
CACGCACATC 
ACTCATGACG 
CATCTGTTTC 
TTAGGAGAGT 
CAATTTTCTT 
AAACTGAATC 
CTACAAGTTT 
CCAAAACAGG 
TTAAATAACT 
TGATACGTTC 
GGATAGGATA 
ACATATGTCT 
TGCTTTCTCC 
AAGCATCAAT 
AAAGATTATC 
CTTGACGATT 
CAACATGACC 
TTGTATCCCC 
CTTCCAAAAT 
CAAAATCGGG 
CCGTAACGCA 
ACATCTTGAT 
TTACTCCTAA 
ACCACCTCAG 



TATCTTTGTA 
AGCCTTGGAT 
AACTTGCTTT 
AGTTTTTACC 
GTGAAAAACA 
TCCAAGGTAA 
CATATCATCT 
TAAAAAACGC 
TTTTCCTGTA 
ACAAGCCGTT 
ATTTTCTAAT 
CAAAGTCCCT 
TTTTCCTAAA 
TCTTGATAAA 
ACAGCCACTT 
AATAGCCTTG 
CATCGGTTCA 
TATCCATTCT 
CGCCGTATTA 
TTTTGAAAGT 
AAACTCTTCA 
AATTGGAATC 
ATGAACTAAC 
GCCAATGGTC 
AATAATCCCA 
AACTAATGTT 
GGCTTCTGGA 
CAAATAATGA 
ACATACTTGA 



5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440
ACAAATAGCT AATGTTACTA AACTAAAATT ATCAGACAAG ATAAATATTC CTAATCCCAA 7500

AGTTTGGACA ATCGAAGCTA ATATAGTTGT CATTGTAGTT TCTTTCACTT TATCAATAGC 7560 

TCCTAAGACA GGCCATCCGT AAATCATAGA ATAAAAACTA GCAACAAAAG CGGGTAATAA 7620 

GTACTTAAGA AAATCTGCTG AAACGGTATA TTTTTCACCA CCAATTATAG AAAGAATTTG 7680 

ATTTGAAAAG AATAAAACTA TCAAAACTCC AAAGATAATA GGAATAAACA TAATCCGATT 7740 

AATACTCTTA ACCGATTGTA TATCTTTAGT ACGTATCATA TGCGGATATA AACTATTCGC 7800 

TATAGGATTA TACAATGATT TTGCTGCTGA AAGCAGTTGC ATTGCTATCC CCCAAAAGGC 7860 

TATCTCTTGA CTTTGTAAAT AAAAACCCGA AATGACTGTC GTAAAGACGC CAAAAATAGT 7920 

AGTTGCAAAA TTGGATAAAA AATAAATAGA GGATTCCTTT AAATCTTTAA CCCAAACAGA 7980 

CAGATAAGAA AATGATAATT TAATTCCATA ATAATGAAGG AATCTATAAG AAACTACTGC 8040 

AGCAACTAAA TTCCCAATTC CTTCCAATAT AGGAATCCAT AAAATAGAAG AATCATCTTT 8100 

TACTACAATA AATGTCAAAA TTGTAATGAT AGTTTTAGAA ATAATATAAG GAATTGCAAC 8160 

TGCATGCATC TTTTCAATTC CACGAAATAA AAAGTCAAAG ATAAAAATAT TGGTCACTGT 8220 

AGCTAACAAA TAAAAAACTG AAAAAAGAAT ATTCTCTCTC ATTATTGGGA TTTGCCACAT 8280 

CAATATGGTG TAAATTAGAA TCGAAATGAT AGATAAAAAT ATTTTTTCAA CTAGAGTATC 8340 

TCCAACTATC CTTCCAATCT TTGAGGGAGT AGTACAAGCA TTTACAATAT TTTTTGTAGC 8400 

TGATATCATG AAACCAAAAT CAATCACCAG TTGAACATAA GCTATTAACO CTTTAACATA 8460 

AATAACCATT CCATACGCGT CTAGCGAAAG CACCCTTGTC AAATACGGGA GTGTTAATAA 8520 

AGGAAATAGT AATTTAACAA TATTCAGAAT ATAGAGAGAA CTTGTATTTT TTATAAATGA 8580 

AATTCTATCA ACTTTCACGA ACTAGTCCTT CCAAAAAAAG ATCTAAATAG TCCAAACTAC 8640 

TTCTCGCTTT CAACACCAAT TCTGAAGGTA TTGTTATCGG TTTTAGATGA AAAGTTTCAA 8700 

GTTTCTTTAC AATACTATTA ACACTTGAAT CAAATAAAGA TTCACAACGT TGTAACTCTC 8760 

CAATTGCTCC ATAATAACGT GCTGTTTTTT CTGGATGGCA TGCAATGGCA ATCACAGAT V 8820 

TATTAAAACA TGTTGCCACT ACCCCAACAT GTAATTTACA AGTTAAAACC ACATCTACCA 8880 

TTTTCAACAA TGATGTCATT TCTGCAGGAG AATGATACTT GAATTGAAAA CAATCCTCAG 8940 

TTCTAACTAA TTTTCTAAAT TCCTGATAAT AAGCATCTTC ATAAGGTAGA ATGGAATCCG 9000 

AAGTTACTAC AACATAATAG TTAGGATTGT TTTCTAGAAA AAGACTAATT GATTCCGCAA 9060 

ATTTTTCAAG AGCTTTTTTG GAATGATTAT AGTGAACAAG AATTATCTTC TTATCTTTAG 9120 

CTTCTCTTTT CAATTGACAC AGCTGCTCTG TTTTTTCTTC TCTTAATTTA CTTGAAATAA 9180 

TTAAATCAAA GGTTTCATGC ACTGGAGCCG AAGGCGACAA ATGCTTCAAA GAATCAAATG 9240
ATTCTCGATC ACGAACTGTA ATAAATTGAG CATGATTAAT AATTCTCTTT ATACCATAAT 9300

TCATCAAAGA ATCGTTATTA GGCCCTGCAC CAATACCTAA TACTCCTATA GGCTTTTTAA 9360 

AATATGAAGC CCAAATTCCC AAAGGTAAAA ATCGTTTAAA TTGGATTAAA TTATCACGAA 9420 

AACGTGCATT ATGCCCTTCC CCAAAATATC CTCCCGGGAT ATACAAAATA GCATCTGCTT 9480 

GTTTTTTAGT AAAACTTTGT TTTTGGCGAT ATTCTTTCAA GTACATTTGA AAGAAATCTG 9540 

ATGGATTATA AAAAGAAACT TCATATCCTT TAGATTCTAA TAAATCATAG ACAATCTCAC 9600 

CGTAAAGATA ATCACCGTAA TTACTTGAAC CATAATCCGT TGCACCATGT AACATAATTT 9660 

TTTTCACCAC TATTTTTTCA ACCTCCTAAA AATAAATATC ATAATCAAAC TATACATAAT 9720 

AGGACGATAA ACATCTATTG AACTACTTCT CACTAAAAGC AATAGTTGAG AAATTACCGA 9780 

AAAATAAATA ACTTTTGAGA TTTTACTTGT TTGAAAAGCT CTGAAATTTA ATCGCCATCC 9840 

ACTAAATATT CCCAAAACAA AACTCCAAAA AACACCACCA TAGTAACCAA AGTTCCAAAA 9900 

TAATTCTTCC ACAAAAGAAG AGCCTACAGG TAACCCCAAA AATTTATTAA TAACAACCGT 9960 

CGCTGATGCT TTATCAAAAA AATCACCAAC TAACCATCCA ATAGGAAAAA TTGATAGGAT 10020

AGTGCGTAGA AATGTCATCC CATATTCATA TGGAATGCTA CTAGGCACAA CAGTTACAGC 10080 

AGAAGCTACT GTTAGGCTGG TCAGTCCCGA CTCTGAAAAT ACTTCCCCTA GTATATTCTT 10140 

TACAAAATCT AATGAAGAAA AGGAATCAAA TAAGTATATA CCTATAGTAT TCAAGTCGAA 10200 

ACGGTGCCCC CTAATAACAA CTAATACATT TAATAGAAAT ACAGTTACTA TTAAAAATAC 10260 

AAGTACTCTT TTCTTCGAAA AAGTAATCCC TAAAGATTGT GTGTATACTA AAACCAACGC 10320 

CAAGATTGAA AACACCTGGA TTTTACGACT TCCTGTTAGG ATCATTATCA AAATTAGGTA 10380 

AAACAACATT ACCCAAAAAA TAGTACGCTT TATAACTCGG GACAGCTTAT CTGAATAAAA 10440 
CAAGGAGAAC ACACCAGGAA GCATAAGTAC TCCTAAATCA TCTATTATTC CTGAACTAGC 10500

TGCCTCTGAA TATGCTGAAT AGCTATTCGC CGCTCTAACT GCTAGTACTG TTTTAGAATC 10560 

AGTTATTACC CTAGAAATAA AGCCCACTCC TGTTAAAATC CTACCCGCAT TGTACAAAAT 10620 

TTTCTCTTCA TTTTCCTGAT AATTTTGTAC TTCTGAATGA TAATGTACCT TTCCATCACT 10680 

ATAAAAAAAT AAATAGCCTA CAGAATAACA AAACAAAATC CAAATTATAA AAATATATGA 10740 

ATGAAATAAT TCTTCATTAT TATAGAAGTT ACTAGGGCTC CACAGCAGAG TTGTTTGAAA 10800 

CCCCATATAC TCATTGAAAA TTAATCCAAA CATAAAAAAA TAAGATAAAA TCAGATACCA 10860 

TACAGAAAAA TCATATATAC TAACTTTTTG TAAAATAAAA CCAGTAATTT GAAAAATAAT 10920 

TAGAAAGCAA ACCCATATAA ATATAGACGG AACATAATTA GATATAAGAA AACCATTATT 10980
CCAATTATCG
TGTCACTCTA 
ATAACGATTC 
GTTATTTATT 
TAGCAACACA 
TTGCGTCACT 
AACAGGCAAC 
AGAAATATCA 
TTTCTGTTTT 
TTTGATTAAA 
TAGGCGAGCT 
TTCTCTAACA 
AATTTTTCCG 
TGCAAACCAA 
TTGAAAACTG 
AATTATTTTA 
ATAATCTCCT 
TAATAGAGGC 
TTGTTCTTTT 
ATAGTTGAAT 
TTTAATCATT 
GTTTTTCTAA 
CAAGGTTTTG 
CGAAACCAAA 
TAGGTTTATA 
GAGTGTGATT 
CTTTAGGCTG 
CCCTTAATTC 
AAAAATCGAT 
GAATTGGAAA 



AGAGTCCAGA 
CAAATATACT 
AATAATTTAC 
TCAAAACGAT 
GACTCTTCGT 
GTATCTGACG 
CCCTCATATT 
GTCCTTCTCC 
AATTTCTGCT 
ATGAGTTCTT 
ATATTTCCTA 
TCTGACAAAA 
TCTTTATACG 
TGAGTTGCTA 
TTTTCTGTTA 
GATAAGATCA 
TTCTTTATTA 
ACATGATAAA 
CCAGGCACAA 
AGAAAACTTT 
CTTCTTCCTT 
ACTAATTCTG 
AATATACAAA 
ACCATTCTCT 
CAACAATGCA 
CCCGTATAAA 
GAATGTGTCC 
TCCTGCATTA 
TATTTTTTTA 
CGTACTATTA 



ACAAGTAACA 
TTGTCTGCAT 
TAGCTTGATA 
TGCATTCCTC 
TGATAGGTAA 
ATAAAATTTG 
TAGACGGAAG 
CTAAAAATAG 
CATCCTCACC 
TTAAAACGTT 
ATACGAACTT 
ATTGATACTT 
CTTTCTCTCC 
AGATTTTTAC 
CATAAGCCAT 
GACCAATTGC 
TTCTAGCAAG 
CCTTTGCACC 
TAAAATCAAA 
CTACTCCACC 
AAGCTTAAGA 
TCCATGAAGT 
GCCAAACAAT 
ATTGACACTT 
GCAAAGTAGA 
TTCAAAACAA 
ACCAAGTTAA 
GTACCTATAA 
TTTTGTTCTT 
TTTTTTAACT
GAAAGCAAAT



CTATATCTCC 
ACAAATATCA 
AGATGTTAAA 
GTAACTAATG 
TAATCCCGAT 
CAAAAAAACA 
CACATATGGG 
ATTACCAACT 
AAATAAATAA 
ATTTGACACA 
TTTCAAATCA 
ATATAACCAC 
CAAAATTGTT 
ATGACTATGA 
AGATTTATAG 
AGAGAGAAAC 
CAATTCTTTC 
TTGAATTTTT 
ACTATCTAGT 
TTCGCTTCTC 
TATCACAATT 
CTTTTTCCGA 
TTTCCATATA 
GTTTATTAGA 
CATCTGTATT 
CATTGCTGAT 
AATTCAACTG 
GAAAACGAAT 
GCTTTACCTC 



ATAAAACTTA 
TTTATTACAC 
TAGAGTCCAT 
GACAGTACTT 
TTTTTGGTCA 
GCCTGAGCCT 
TCCATCGCAG 
GTCAGATTTA 
AGGAGTAAAA 
CTTTGGTTTT 
TCTAATTCTC 
ATTGCATTAA 
TTAGCCGAAT 
ACTAATTTAC 
ATAATTCTAA 
CCATGGCAAT 
TGATGTAGAG 
ATTTTATCCT 
TTTCTATCAA 
GTTGTAAATA 
TAATTCTATT 
CTTAATTAGC 
TTCATCCTTC 
AGTATCTTCA 
CAAAGCATAG 
CTTATAAAAA 
ATTTTTTTCT 
AAATCGACTG 
TAAACCAATG 
GTTTAATTCT 



ATGTCACTAG 
ACATTTCTTG 
CTGTCATACT 
TATCTTTCCA 
CATCTACTTC 
CTACTAGAGA 
ATAATAAATC 
GTTCTAAAGC 
TAACATTTGG 
TTTGATCTGA 
TACGACATTT 
AAATAATTTC 
CTTCCCCACA 
GCAATACTTT 
TTTTACAACC 
GAACTATATC 
GCTTTTTCCT 
CTAAAAATCC 
TGTGAGAATA 
GATGTAATAC 
TCTGTTTTTT 
TGTTTCCTGT 
ATAGGTAAAA 
CAAACTAAAA 
TCTAGTAAGG 
GACATGGTAT 
TGACAAAATT 
TCATTTGCAA 
TAGGAAAGTT 
ATCATATTGG 



11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780
GTAGGTTATG
AAACGATATT 
ATTTTTCATA 
GAAAATCTAC 
TTTCTTTTAG 
ATTTTCCTTT 
TATTCTTATC 
AAAAATTGAA 
AATTATTCTT 
TTTGAAAAGT 
ATAGATAACA 
CTAACATATC 
ATAATTGGTT 
ATCGAGTCTC 
TCCTCCAAAG 
GCTTTCATGC 
AAATTGGTAT 
TTGCCAATAT 

GGGATAAAAA 
AATTGCTTCA 
CGGTAAACTT 
CCACCTACGG 
TATTTTTTTT 
AATACACAGT 
GTAGAACTAA 
GGGATTGTAT 
TGACGATAAT 
ACTTCATGCC 
TTATAATGTT 



GGTAGTAAAA 
CATTAAAGAA 
ACTGTAATCA 
TAAAATGAAA 
CTTCTTTTTA 
ACCAGAAAAA 
CCAATATATA 
GTAAGGAGTT 
TCTTACTTTC 
GATTTTCATA 
TACTAAATTT 
CACAATTTGC 
TGCCTGCCGC 
CTATTAAAGA 
AACGTCTTCC 
TTAACAATTC 
TCTTCTCTAT 
TACCAGCAAA 
GATCTTCTGC 
CAAAATAATT 
TTTTTGAGAT 
TTAAACTATC 
TATAAGCCAT 
CAAAATTCGA 
TTGCAAAGCT 
AAGAACGATA 
CTGCATATAT 
CTTTTCGAAC 
GGCAAACAAA 



TACTCTCCCA 
TTTTTCACCA 
CGAATATCAT 
GACACAATAC 
ATTTCTTTTC 
GAAATACGAT 
ACATCGTAAC 
AGATATATAT 
CCTCTCTAAA 
GTAACAACGA 
ACAAATATTT 
TTCTTCTACA 
CATATAAGAk 
AACTAACATA 
ATAGAAGGAA 
CGTACCATCT 
CAAACTGGCA 
AGTTAGGTCA 
ATATTGTGGC 
TTTAAATGAT 
AAATTTAAAC 
TGGCCAAACA 
ACCAGCCCAT 
TCCATCTTTC 
AAAATAATTC 
TATCGTAACA 
CTTCCCTTCA 
TAAATCTTCA 
TAGTATTTTC
TTGGTAAAAA
ATTGTTTCTG 
AAATATATCT 
TATGTAACGG 
TGAATTTTAC 
AGTAGTTTTG 
TAATAGACAG 
TATCAGATAG 
CATGTCTCCA 
GCTTTCTTTC 
TTTGCCAATT 
ATTATTTTAG 
TGTACCTTCC 
GCATCTGATT 
ATATTCTTTA 
CCAACAAAAT 
GCTTTCAAAA 
ACACTTTCTT 
AAATATGTAA 
GGACTAGTGA 
AGCTTGAAAA 
TCCATACAAT 
GCCATCATAA 
GTTTTATACC 
AACAATCGAA 
CCTTCTATAA 
GGGTAATTAG 
CAAATATCTG 
ATTGTCCAAT 



AAATTTATAG 
AACCAAACGA 
ATTTTTAAAT 
CAATATCATA 
ATAACCTAAT 
TTTTGTAATA 
TTTCTTCAAT 
TATAAACAGT 
GTTCGAGCAT 
CTAACTCTCT 
GTTTTACATC 
CATCTCCTGA 
CAGGTATAGT 
TTTTATAGAA 
ACTCCAATTC 
GAAAATGAAT 
TAGTTTCCAA 
TATTAACTAT 
TCTTTTGTTC 
CAAATATATA 
TCAAGCCATC 
ATAGAAACAT 
CTGGAGACAA 
TCCCCAATAA 
ATACAACACT 
TCTCACGTCT 
GAATCCCAGC 
ACAACCTGAA 
TTAACTTTCT 



CCGTCTGAAG 
TAAACCAAAA 
GAAAAGAGAA 
TCATAATCAT 
ATCTTACTTA 
ATCTCGTTAA 
AATTCTTTAT 
ACTCTCATTA 
AAACTGCTCT 
TTGTCTCTTA 
TCGTTCGGGA 
AATTGCACCT 
ACGAGAAACT 
GGATGGCATT 
ATGAGCTAAT 
TTTCTTGGGT 
ATTTTGTGCT 
AGATTCATCA 
GGATATGTCA 
ATCACTAGCT 
TTGTTTCACT 
CGGTTTCTTA 
TTGGTTAACG 
AACTCCTAAA 
TTTTTTTCTA 
TTTTTTATTA 
CAAAACAGAG 
T«GGTTCTGGC 
TTCTTACCAC 



12840 

12900 

12960 

13020 

13080 

13140 

13200 

13260 

13320 

13380 

13440 

13500 

13560 

13620 

13680 

13740 

13800 

13860 

13920 

13980 

14040 

14100 

14160 

14220 

14280 

14340 

14400 

14460 

14520
TACCCTCTAC


AATACCTTTT 


CGTTTCAGTA 


CGTAAGGTAT 


TGTCTTAACT 


ATACATCTAA 


14580 


TATCCATTAT 


CAAAGACAGA 


TGTTTAACAT 


AGTAGCCATC 


TAACTCCGTC 


TTCATCTCAA 


14640 


CAGACAAAGT 


ATCACGCCCG 


TTAATTTGTG 


CCCATCCAGT 


TAACCCTGGC 


AAGATATCAT 


14700 


TTGCTCCATA 


CTTATCTCTC 


TCTGCAATCA 


AATCTAGTTC 


ATTTATACCC 


GCTGGTCTAG 


14760 


GACCTACAAT 


ACTCATATTA 


CCAACAAGAA 


TATTAAACAA 


TTGTGGTAGT 


TCATCCAAAG 


14820


ATGTTTTTCG 


CAAGAAAGCC 


CCTACTTTTG 


TAATCyATTG 


CTCTGGATTA 


TATAAGTTTC 


14880 


GAGGCGCCAC 


ATTTTTAGGT GCATCTATTT 


TCATAGACCT 


AAATTTCAAA 


ATATAGAAGT 


14940 


ATTCTTTATG 


AATACCAAAG 


CGTTTTTGCT 


TAAATATAAC 


CGGACCTTCT 


GAATCAAGTT 


15000 


TAATCGCAAT 


TGCAATTATC 


ATAAAAACCG 


GACACAATAT 


TATTATCCCT 


ATTAAAGATA 


15060 


ATAATATATC 


ACCTAATCGT 


TTTATTATAC 


CGTACATAAA 


CAACCTCCAA 


CTATAAATTC 


15120 


TATTTCCATT 


TTTCATTCTA 


TTTCCATTTG 


ACAAATTAAA 


TCACGCAGTA 


CATGCAACTA 


15180 


CAGAAACTCA 


ATATATATTT 


GGTCACTCAA 


TGATTTTCAG 


AAATATAATT 


CTTTTATCCT 


15240 


CTACGTCAGA TAAAACTTTT CTCCATCTAA ACAAAATTTA TTTGTTTCAG 


TAATATATGA 


15300 


GTTCTCAATA 


ATGAATTAGA 


AGGTCCAGTT 


CAATTATTCT 


TCCAAATAGA 


CCGAATATTA 


15360 


TTTGAAGACA 


TATCGGTTTC 


TGAAATTGCA 


ATCAGTACAT 


AAGCTAATAA 


ACTGATAAGT 


15420 


ATGCTCTGTA 


AGAATGCCAG 


AGTTATATTG 


TAGTCCCCTT 


CCATACTATA 


TTCATTTTAT 


15480 


TTTTTACCAT 


AATTTCCATA 


GGAACCGTAA 


ACTCCATACT 


TATTAACCGA 


GATATCCAAT 


15540 


TTATTTAAAA 


CAACTCCTAG 


GAACAGTTTC 


CCTGTTTGTT 


TTAATTGTTG 


TTTCGCTTTT 


15600 


TGGATATCAC 


GTTTATTCGC 


CTCACCTGTT 


GCTGTTACCA 


AGATGGACGC 


ATCACACTTT 


15660 


TGAGTGATAA 


TTGCCGCATC 


AATAACAATT 


CCAATAGGCG 


GTGTATCAAT 


AATGATATAA 


15720 


TCAAAATATT 


TACGCAATGT 


TTCAATCATA 


TCATTAAAAT 


TTTTACTTTG 


TAACAAGGCT 


15780 


GTAGGGTTTG 


GTGATACAGA 


TCCCGATTGA 


ACTACAAATA 


AATTTTCAAT 


ATTTGTATCA 


15840 


CATAAACCGT 


GAGATAAATC 


AGCTGTCCCA 


GATAAAAATT 


CTGTTAGCCC 


TGTAATT1 rT 


15900 


TCACGAGATT 


TAAAAACTCC 


TAACATAACT 


GAATTTOGAG 


TATCGCCATC 


GATCAAAAGA 


15960


GTTTTATAGC 


CTGCACGCGC 


AAACGACCAT 


GCTATATTTA 


TGGAAGTAGT 


TGTTTTTCCT 


16020 


TCCCCAGGGT 


TAACAGAAGT 


AACGGAAATT 


ACTTTTAGTT 


TATCTCCGCT 


CAACTGTATA 


16080 


TTTGTACACA 


AGGCATTGTA 


ATATTCTTCT 


GCCTTCTTAA 


TGAACTCCAG 


TTTTTTTTGT 


16140 


GCTATTTCTA ATGTCGGCAT 


CCTTCTCTCC 


TATTTCAACT 


TACCCAAGTT 


TGGCACAACT 


16200 


CCCAAAAGTG 


TCATCTGCAA 


TGTATTTTCG 


ATATCTTCCG 


GACGTTTCAC 


ACGAGTATCC 


16260 


AAAAGTTCAA GATGAAGAAC 


TATAACACTA GTTCCAATCA CCCCTGCCAA AAAACCAATT 


16320 






AGTGTATTGC 
GTTGTCACGT 
GAGTTAGCGA 
CGGGTATCAA 
AGTTTCAAAT 
TCTTTTACCA
TGATTGCGAT 
GTGCTATATG 
CGTTTCCACA 
ATCATTTCTC 
CCTGAGCCTT 
GAGGTCTACC 
AAAAATACTG 
GGACATGTGA 
TTTCATTATT 
ACATCAAGAT 
CTATCAAGGC 
GAACATCTGG 
CCTTAGCTAT 
ACATGCCCTT 
CTGCCAAGAG 
TATGCGAATG 
ACTACAGCTA 
TAAGAAGGAA 
CCTCCACTTT 
TGGATAGAAT 
GTTAATTTTT 
CCATCTGCTA 
ACATTGCCTG 



GTTTAATATT 
CAGAAACACG 
TACGGCTTGC 
CTGGTACTGT 
CAGAAACAAC 
GATAAGTTCC 
TCACTACGTA 
CAAAAGCCCC 
AGCTTTTAAC 
CTAAATTAGT 
CGCTTCTCCG 
GTCTAGATTG 
AGCTCTTTTT 
ACTATTTACT 
TTCAAGAGCA 
CTTGCTCAAG 
ATAACGACTA 
TGTGTAATAA 
TTCCCGAACC 
GCGACGGTGA 
AGCCTTGCTT 
GATGTCTATC 
AACTACTATC 
GATCCATCCG 
CTAACTGAGC 
CTTGCAAGCT 
GAAGGATAGC 
GGGAGTAGCG 
CAGGGTAATA 



TGGCGAAGAC 
AGTAATACTG 
CTOTPCAGGA 
CACTTTAATT 
TTCCTCCAAA 
TGCCTGCAAA 
AATTCGCGTG 
CGCACCTGTC 
TAATTGAAAT 
TGATCCATTA 
TATTTTTGGG 
TGCATATCAC 
TTCATGAATT 
TGCGTGTAAC 
TCATAGCGCT 
GCGCTATGAA 
TCATTGAGGG 
ATTTCAGCCC 
TGAAGAAAGT 
GAGGTAGAAA 
TCCTCTCTTG 
ATTTCATCTA 
ATCTATTTCC 
ACCTGTCCCT 
ATTGACCAAA 
ATTAATGATC 
CACAATCACC 
CTCACGAACA 
CTTTCCATTC 
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GGGGATATCG 
ATAATTTTTT 
ACTCGATCAT 
TT ATT AG CCA 
ACATCCTGCG 
TCCTGATTTG 
GTACTCGTAT 
ACAAGTGCCA 
ACATCGATTT 
CAATTTTTCG 
TAACAAGGTC 
TTGCAATGAC 
TATAACGTTC 
AGCCCATATC 
CAATGTGGGC 
TATCGCGATA 
TCGGAATCCG 
CGTAAGCAAT 
TTTCTGCTAT 
CAATGGTTCG 
ACTTGGGACC 
CCCTCCATCA 
ATCACATAGA 
TTTAAATCTT 
TTTATCATGG 
GTACTATAAT 
TTTTGTTGAT 
AAACCGAGAG 
GTATGGGCAG 



CCGGCCTTGC 
GAGCAGCTAC 
TAACTGAAAT 
AACCTTTTGG 
AAAGGATAAT 
TCAACCCCGG 
ATTCTGGCTT 
CTATTAAAAT 
CTATCGTATT 
AGGATTGTCT 
ATATGCTTCT 
ATGAACCAAA 
GCCAAAAAGT 
GATCAGTTCT 
AATGACTGGA 
AGGAGTGTTC 
CTTTTTTTCC 
GACCAAGTCA 
CTTCTCTTCC 
CACCCCCTGT 
GTCATCTACA 
CATCCTGTAT 
GGTTACTGTC 
GAGAATTTAC 
TCTCAAGTGG 
TTTTCAGCAC 
GGCGCCCGCG 
CCTGTTCTGA 
TAAATTCTTG 



CTCCTCCAGT 
TTCTCTCAAA 
AGAGACAATA 
CGTCAAATCT 
CTCACGGTAG 
CTTGTCTCCT 
AACAATAAAA 
CATTAGCTTG 
TTGTTCTTTC 
ATAAAAAGTT 
GCCATATGAG 
TCCTGCTCTA 
TTGGGTTTGA 
CGAACGCGTT 
GTAATTCCCA 
ATACTAAACT 
AGCTTATCCA 
CTCGCCACTT 
GGAGTTTCAA 
CTGTAGGATT 
TCAAAAACGA 
AGCTGCTTTA 
TGGCATTGCA 
TTTATAATTC 
CATATTTGTT 
TTCGGTTGAC 
GTCACGATCG 
ATCAAGATGA 
ATCATTATAA 



16380 

16440 

16500 

16560 

16620 

16680 

16740 

16800 

16860 

16920 

16980 

17040 

17100 

17160 

17220 

17280 

17340 

17400 

17460 

17520 

17580 

17640 

17700 

17760 

17820 

17880 

17940 

18000 

18060 
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ACATCAATTC CACCCAACAA ATCAATCAAT TTCAAAAACG AAGTGAAGTT CAATCGCACA 18120 

TAGTAATTGA TATCCACTCC ATAGAGATTT TCTAAGGTGT GAATGGACGA ATCAACTCCA 18180 

TAAATGCCCG CATGAGTCAA TTTATCTTTT TGATTATTTC CACCATCTGC GATTGGTACA 18240 

TAGGCATCAC GTGGCGTTGT GGTCAAGAGG ATTTTCTTGG TATCTCGATT GACAGTCATC 18300 

AGGATGTTGA CATCTGATCG CGACACCGAA CTAATAGGAC CATAGGTGTC AATTCCACTA 18360 

ACATAGATAT TGAAAGACTG ACTCTTAGAC GTCTTAGGAG CTTCTACTTT TTTAGTGAAT 18420 

CCCTTAGTAT AAATCTTTTT TATCTTCGAT GCGTAGTCTG GATACTCTGA CTCGATGATG 18480 

TTTTCAAAGA CACTATTTAG GACAATGGCC TTAGTCTCCC CTGCAATCAA ACTCTTGTAA 18540 

GCTGCCAAGT AAGACGAACT CTGGTTGACC GTCAAATCGG TATTCTGACT TGACTTGATA 18600 

TCAGCTAGTA ATTTCTGAAT ATTTTCATTA TTAGTCCCAG TCGGTGCTGT CACACTCGTC 18660 

AGTTGCGTAA CATTTTCGAT CTCACTATCT GCTAAAACAG CGACACTGAT TGAATATTCT 18720 

GAGTAATTAG AAGTCGCATT TAAACGATTG GTCAGTCCAA CAAACTGCTG TACTGCAAAG 18780

AGCGACACAG AGCTGACAAG GATAGAGAAC ACCAACAGAA AAATAGTAAA CTTTTCAGCT 18840 

TTTTTATAGA TAATCAAGAG TAGCCCTACC AAGGCAACTA GTAGGACTAA CGCAGTTACC 18900 

ACTAGATTAA GATATCTAAA AGCAAGGATA TTGTACTTAA AGATTAAGAA CAATAAAAAA 18960 

CAAACTAACA ATAAATAAAT AGTCAGCAAA ACTATATTAA CACTTCGCTT CACTTTCTGT 19020 

GAACGTGATT TTTTAAAACG TCTACTCATG ATTAATACCT ATACATTGAA CATTATACGA 19080 

TTATATCACT TTTTTACGGT AATGTCTACA CCTTTATTTT TACTATCTGC ATCTTTAAGT 19140 

ATCTTAGTAG ACTTCCCGCG AAACAAAAAT ATAGTAAAAT GAAATAAGAA CAGAACAAAT 19200 

CGTTCAGGAC AGTCAAATCG ATTTCTAACA ATGTTTTAGA AGCAGAGGTG 19250 
(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21706 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AAAGTTGAAA GACTGCTAGC TGTTTTTGAT ACCAATCGTT TCCAACTACA GAGCAAACAG 60 

TATACAAAGT TTGTTTTTGG ATGTAAGCTT CTTGATGGAC AATTCCAAGA AAATCAAGAA 120 

ATTGCTGACC TTCAATTTTT TGCCATTGAC CAACTGCCGA ACTTATCTGA AAAACGCATT 180 

ACCAAGGAGC AAATAGAGCT TCTTTGGCAG GTTTATCAAG GTCATAGGGG GCAATATCTT 240 
GACTAAGAAG ATGATTATCG TATTTCTAAA TCCATTTTTA ACAACTAGCA TGGTATAATA 300

ATATGCAGGA AAATTTTGAA TTATGAGGAA GACTAGATGA ATTTATGGGA TATTTTCTTT 360

ACGACTCAGG CAACCGAGCC GCCCAAATTT GACCTTTTTT GGTATGTTAG CCTATTTACG 420

CTCTTAGCCT TAACCTTTTA TACAGCCCAT CGCTATCGTG AAAAGAAGGT TTACCAACGA 480

TTTTTCCAAA TCTTGCAGAC TGTTCAGTTA ATCCTTCTTT ATGGTTGGTA CTGGGTCAAT 540

CATATGCCAC TGTCAGAAAG CCTACCCTTT TACCATTGCC GTATGGCTAT GTTTGTGGTA 600

CTCTTGCTTC CTGGTCAATC CAAATATAAA CAATACTTTG CATTATTGGG AACATTTGGG 660

ACATTAGCAG CCTTTGTTTA TCCAGTGCCA GATGCTTACC CTTTTCCACA TATCACCATT 720

CTATCCTTTA TCTTTGGTCA TTTAGCACTC TTGGGGAACT CTCTAGTTTA TCTATTGAGA 780

CAGTATAATG CGCGATTGCT GGATGTGAAG GGAATTTTTC TCATGACCTT TGCCCTAAAT 840

GCCTTGATTT TTGTGGTCAA TTTGGTGACA GGTGGCGATT ACGGATTTTT GACAAAACCG 900

CCATTGGTTG GGGATCACGG TCTAGTAGCT AATTATTTAC TTGTTTCAAT TGTGCTGGTA 960

GCTACTATCA GTTTGACTAA GAAAATCTTA GAATTCTTTT TAGCTCAAGA AGCAGAAAAA 1020

ATGATTGCAA AGGAAGCTTA ACACAGAGCT TTCTTTTTTG CTCTTAGAGA GTTTTTACAA 1080

GCAGCTTATA AAATAAGAAT TTCTGAATAG ACAAACTCAA AAAATGGCTG GGAAATTTAG 1140

GAAAAAAGCA AGCACGATTA AATTTTTTGT GTTATAATAT TTTGTGAATA GCTATGCCTA 1200

TGTTTAGCTA TGGAATAATA CGAAGTGCGA AACTTGGAAG ATAGAGAGGA AGCGATGTAA 1260

TGGCTAGAGA AGGCTTTTTT ACAGGTCTAG ATATTGGAAC AAGCTCTGTC AAGGTGCTTG 1320

TGGCCGAGCA GAGAAATGGT GAATTAAATG TAATTGGCGT GAGTAATGCC AAAAGTAAAG 1380

GTGTAAAGGA TGGAATTATT GTTGATATTG ATGCAGCAGC AACTGCTATC AAGTCAGCCA 1440

TTTCCCAAGC GGAAGAAAAG GCAGGCATTT CGATTAAATC AGTGAATGTC GGCTTGCCTG 1500

GTAATCTTTT GCAGGTAGAA CCAACTCAGG GGATGATTCC AGTAACATCT GATACTAAGG 1560

AAATTACGGA TCAAGATGTT GAAAATGTTG TCAAATCAGC TTTGACAAAG AGTATGACAC 1620

CTGACCGTGA AGTCATTACC TTTATTCCTG AAGAATTTAT TGTGGATGGT TTCCAAGGGA 1680

TTCGTGACCC ACGTGGCATG ATGGGGGTTC GCCTTGAAAT GCGTGGTTTG CTTTATACAG 1740

GACCTCGTAC TATCTTGCAC AATTTGCGTA AGACGGTTGA GCGTGCAGGT GTTCAGGTTG 1800

AAAATGTTAT CATTTCACCA CTAGCAATGG TTCAGTCTGT TTTGAACGAA GGGGAACGTG 1860

AATTTGGTGC TACAGTGATT GATATGGGGG CAGGTCAAAC GACTGTCGCT ACAATCCGTA 1920

ATCAAGAACT CCAGTTCACA CATATTCTCC AAGAAGGTGG AGATTATGTA ACTAAAGATA 1980

GG T r^ CC TCTCGCAAAT T AGC^GG

-CCT^c GCCTC^CA AGCAAAGAAA CC^CAAG, AGAGGTTATT GGAGAAGTAG 
AAGCAGTCGA AGTGACGGAA CCCTAC™ CAGAAATTAT TTCTGCACGA ATCAAGCACA 
TCCTTGAACA AATCAAGCAA GAATTAGATA GAAGGCGTCT A^GGACCrc CC^. 
TTGTCTTAAT CGGTGGGAAT GCCA^c CAGGTATGGT TG^C. CAGGAAGTCT 
TGTCAAGCTT TATGTTCCAA ATCAAGTTGG TATOCGTAAT CCAGCCT^ 
CCCA^ TAGTTTATCA GAATTTGCGG GTCAATTAAC AGAAGTTAAT C^G^C 
AGGGAGCGAT AAAAGGTGAG AATGACTTAA GT^AGCC AATTAGTTTT GGTGGGATGC 
TGCAAAAAAC AGCTCAGTTT GTACAATCAA CGCCTGTTCA ACCAGCTCCT GCTCCAGAAG 
-OACCCGT GGCGCCTACA GAACCAATGG CGGA^CCA ACAAGCTTCA CAAAATAAAC 
CGAAATTAGC AGATCGTTTC CGTGGATTGA TCGGAAGCAT GTTTGACGAA TAAAGAGGAA 
AAATAAATTA TGACATTTTC ATTTGATACA GCTGCTGCTC AAGGGGCAGT GATTAAAGTA 
ATTGGTGTOG GTGGAGGTGG TGGCAATGCC ATCAACCGTA TGGTCGACGA AGG^c, 
GGCGTAGAAT TTATCGCAGC AAACACAGAT GTACAAGCAT TGAGTAGTAC AAAAGCTGAG 
ACTGTTATTC AG^GGAcc GTGCAGGAGG TCAACC^G 

GTTGGTCGTA AAGCCGCTGA AGAAAGCGAA GAAACACTGA CGGAAGCTAT TAGTGGTGCC 
<*TA TC G TCT TCATCACTGC TGGTATGGGA GGAGGCTCTG GAACTGGAGC TGCCC^ 
ATTGCTCGTA TCGCCAAAGA TTTAGGTGCG cn^TTO GTGTTGTAAC ACGTCCC^ 
CC™^ GAAGTAAGCG TGGACAATTT GC^GAAG GAATCAATCA ACTTCGTGAG 
CATGTAGACA CTCTATTGAT TATCTCAAAC AACAATTTGC TGATAAGAAA 
ACACCGCOTT TGGAGGCTCT TAGCGAAGCG GATAACGTTC TTCGTCAAGG TGTTCAAGGG 
ATTACCGATT TGATTACCAA TCCAGGATTG ATTAACCTTG ACTTTGCCGA TGTGAAAACG 
GTAATGGCAA ACAAAGGGAA TGCTCTTATG GGTATTGGTA TCCGTAGTGG AGAAGAACCT 
GTGGTAGAAG CGGCACGTAA GGCAATCTAT TCACCACC TTGAAACAAC TATTGACGGT 

CA TC TTATCGTCAA CGTTACTGGT GGTCTTGACT TAACC^ 
aGGWCAC CCAGGCAGCA GGTCAAGGAG TGAACATCTG GCTCGGTACT 

TCAATTGATG AAAGrATGCG TGATGAAATT CGTGTAACAG TTGTTGCAAC GGGTGTTCGT 
CAAGACCGCG TAGAAAAGGT ^GCCTCCA CAAGCTAGAT C^ACTAA CTACCGTGAG 
ACAGTGAAAC CAGC^c ACA^ GATCGTCATT ™ ATATGGC ^ 
GAATTGCCAA AACAAAATCC ACGTCGTTTG GAACCAACTC AGGCATCTGC TTTTGGTGAT 
TGGGATCTTC GCCGTGAATC GATTGTTCGT ACAACAGATT CAGTCGTTTC TCCAGTCGAG 3840

CGCTTTGAAG CCCCAATTTC ACAAGATGAA GATGAATTGG ATACACCTCC ATTTTTCAAA 3900

AATCGTTAAG TAAATGAATG TAAAAGAAAA TACAGAACTT GTTTTTCGAG AAGTTGCAGA 3960

GGCTAGTCTG AGTGCTCATC GAGAGAGTGG TTCGGTCTCT GTCATTGCAG TTACCAAGTA 4020

TGTAGATGTA CCGACAGCGG AAGCCTTGCT TCCGCTAGGT GTCCATCATA TCGGTGAAAA 4080

TCGTGTAGAT AAGTTTCTGG AAAAATATGA AGCTTTAAAA GATCGAGATG TGACTTGGCA 4140

TTTGATTGGT ACCTTGCAAA GACGTAAGGT GAAAGATGTC ATTCAATACG TTGATTATTT 4200

CCATGCATTG GACTCAGTAA AGCTAGCAGG GGAAATTCAA AAAAGAAGTG ACCGAGTCAT 4260

CAAGTGTTTC CTTCAAGTAA ATATTTCTAA AGAAGAAAGC AAACACGGTT TTTCGAGAGA 4320

GGAACTGCTG GAAATCTTGC CAGAGTTAGC CAGACTAGAT AAGATTGAAT ATGTTGGTTT 4380

AATGACGATG GCACCTTTTG AGGCTAGCAG TGAGCAGTTG AAAGAGATTT TCAAGGCGGC 4440

CCAAGATTTA CAAAGAGAAA TTCAAGAGAA ACAAATTCCA AATATGCCTA TGACCGAGTT 4500

AAGTATGGGA ATGAGTCGTG ATTATAAAGA AGCGATTCAA TTCGGTTCCA CTTTTGTTCG 4560

TATAGGTACA TCATTTTTTA AGTAGGAGAG AACCATGTCT TTAAAAGATA GATTCGATAG 4620

ATTTATAGAT TATTTTACGG AGGATGAGGA TTCAAGTCTC CCTTATGAAA AAAGAGATGA 4680

GCCTGTGTTT ACTTCAGTAA ATTCTTCACA GGAACCGGCT CTCCCAATGA ATCAACCTTC 4740

ACAGTCGGCT GGCACAAAAG AGAACAATAT CACCAGACTT CATGCAAGAC AACAGGAATT 4800

GGCAAATCAG AGTCAGCGTG CAACGGATAA GGTCATTATA GATGTTCGTT ATCCTAGAAA 4860

ATATGAGGAT GCAACAGAAA TTGTTGATTT ATTGGCAGGA AACGAAAGTA TCTTGATTGA 4920

TTTTCAGTAT ATGACAGAGG TGCAGGCTCG TCGTTGTTTG GACTATTTGG ATGGAGCTTG 4980

TCATGTTTTA GCTGGAAATT TGAAAAAGGT AGCTTCTACC ATGTATTTGT TGACACCAGT 5040

GAACGTTATT GTAAATGTTG AAGATATCCG TTTACCAGAT GAAGATCAAC AGGGTGAGTT 5100

CGGTTTTGAT ATGAAGCGAA ATAGAGTACG ATAATGATTT TTTTAATTCG TATGATTTAT 5160

AATGCAGTGG ATATTTACTC CCTGATTTTG GTAGCCTTCG CTGTCATGTC TTGGTTTCCA 5220

GGTGCCTACG AATCCAGTTT AGGTCGTTGG ATTGTAGCGT TGGTGAAACC AGTGCTTGCT 5280

CCCTTGCAAC GCCTGCCTTT ACAGATAGCG GGTCTTGATT TATCTGTTTG GGTTGCGATT 5340

GTTTTGGTTC GATTTTTAGG AGAAAACCTA GTGCGTTTTC TGGCGATGAT AGGATGAATA 5400

AAGGGATTTA TCAGCATTTC TCCATAGAAG ATCGTCCATT TCTTGACAAG GGAATGGAAT 5460

GGATAAAGAA GGTAGAAGAT AGCTATGCTC CTTTTTTAAC TCCTTTTATC AATCCTCATC 5520

AGGAGAAGCT ATTAAAGATT TTGGCCAAAA CCTATGGTCT TGCTTGTAGC AGTAGTGGGG 5580

AATTCGTCTC GAGTGAGTAT GTTCGAGTTT TATTATACCC AGATTATTTC CAACCAGAGT 5640 

TTTCAGATTT TGAAATATCT CTCCAGGAAA TTGTGTATTC CAATAAATTT GAACATTTAA 5700 

CGCATGCTAA GATTTTAGGG ACAGTCATCA ATCAATTAGG GATTGAACGG AAACTTTTTG 5760 

GAGATATCCT AGTAGATGAA GAACGGGCGC AGATTATGAT TAATCAGCAG TTTCTTCTTC 5820 

TCTTTCAAGA TGGACTAAAG AAAATTGGTC GTATACCTGT TTCGCTGGAG GAACGTCCTT 5880 

TCACCGAGAA AATAGATAAG CTAGAACAGT ATCGAGAACT GGATTTATCT GTGTCTAGTT 5940 

TTCGATTAGA TGTTCTTTTA TCAAATGTTT TGAAACTATC TAGGAATCAA GCAAACCAGT 6000 

TGATTGAAAA GAAACTTGTC CAAGTAAATT ATCATGTGGT AGACAAATCA GATTACACTG 6060 

TTCAAGTTGG AGACTTGATT AGTGTGAGAA AATTTGGTCG CTTGAGATTA CTTCAAGATA 6120 

AGGGACAAAC GAAAAAAGAG AAGAAAAAAA TAACCGTCCA GTTATTATTA AGTAAGTGAG 6180 

GAATAGAATG CCAATTACAT CATTAGAAAT AAAGGACAAG ACTTTTGGAA CTCGATTCAG 6240 

AGGTTTTGAT CCAGAAGAAG TCGATGAATT TTTAGATATT GTGGTTCGTG ATTACGAAGA 6300 

TCTTGTGCGT GCGAATCATG ATAAAAATTT GCGTATTAAG AGTTTAGAAG AGCGTTTGTC 6360 

TTACTTTGAT GAAATAAAAG ATTCATTGAG CCAGTCTGTA TTGATTGCTC AGGATACAGC 6420 

TGAGAGAGTG AAACAGGCGG CGCATGAACG TTCAAACAAT ATCATTCATC AAGCAGAGCA 6480 

AGATGCGCAA CGCTTGTTGG AAGAAGCTAA ATATAAGGCA AACGAGATTC TTCGTCAAGC 6540 

AACTGATAAT GCTAAGAAAG TCGCTGTTGA AACAGAAGAA TTGAAGAACA AGAGCCGTGT 6600 

CTTCCACCAA CGTCTCAAAT CTACAATTGA GAGTCAGTTG GCTATTGTTG AATCTTCAGA 6660 

TTGGGAAGAT ATTCTCCGTC CAACAGCTAC TTATCTTCAA ACCAGTGATG AAGCCTTTAA 6720 

AGAAGTGGTT AGCGAAGTAC TTGGAGAACC GATTCCAGCT CCAATTGAAG AAGAACCAAT 6780 

TGATATGACA CGTCAGTTCT CTCAAGCAGA AATGGCAGAA TTACAAGCTC GTATTGAGGT 6840 

AGCCGATAAA GAATTGTCTG AATTTGAAGC TCAGATTAAA CAGGAAGTGG AAGCTCCAAC 6900 

TCCTGTAGTG AGTCCTCAAG TTGAAGAAGA GCCTCTGCTC ATCCAGTTGG CCCAATGTAT 6960 

GAAGAACCAG AAGTAGCTCC AATGCATCCG ATAGGTCCAA CACCAGCTAC AGAAACTGTT 7020 

GATTCAATAC CGGGATTTGA AGCACCGCAA GAATCTGTTA CAATTTTATA AGAAATATTC 7080 

TGAGAACAAT ATCTTATCCT TATATTTCCA GCGAGCAGGA GATGGTGTGA GTCCTGTAAT 7140 

CCCTATTGAT AAGATTATCC TCTCAAAAAC TCAAGTCTGA AGCTAGTAAG ATTTGACGTT 7200 

TCCCACGTTA CGGGATAAGA GGGAGAAAGA CTAAATCTTT TTCCGAATAA AGGTGGTACC 7260 

ACGATTTTCG TCCTTTTTGG AAGTCGTGGT TTTTAATTTG TTATTATTTA TAAAGGAGAT 7320 
ACCATGAAAC TCAAAGACAC CCTTAATCTT GGGAAAACTG AATTCCCAAT GCGTGCAGGC 7380

CTTCCTACCA AAGAGCCAGT TTGGCAAAAG GAATGGGAAG ATGCAAAACT TTATCAACGT 7440 

CGTCAAGAAT TGAACCAAGG AAAACCTCAT TTCACCTTGC ATGATGGCCC TCCATACGCT 7500 

AACGGAAATA TCCACGTTGG ACATGCTATG AACAAGATTT CAAAAGATAT CATTGTTCGT 7560 

TCTAAGTCTA TGTCAGGATT TTACGCACCA TTTATTCCTG GTTGGGATAC TCATGGTCTG 7620 

CCAATCGAGC AAGTCTTGTC AAAACAAGGT GTCAAACGTA AAGAAATGGA CTTGGTTGAG 7680 

TACTTGAAAC TTTGCCGTGA GTACGCTCTT TCTCAAGTAG ATAAACAACG TGAAGATTTT 7740 

AAACGTTTGG GTGTTTCTGG TGACTGGGAA AATCCATATG TGACCTTGAC TCCTGACTAT 7800 

GAAGCAGCTC AAATTCGTGT ATTTGGTGAG ATGGCTAATA AGGGTTATAT CTACCGTGGT 7860 

GCTAAGCCAG TTTACTGGTC ATGGTCATCT GAGTCAGCAC TTGCTGAAGC AGAGATTGAA 7920 

TACCATGACT TGGTTTCAAC TTCCCTTTAC TATGCCAACA AGGTAAAAGA TGGCAAAGGA 7980 

GTTCTAGATA CAGATACTTA TATCGTTGTC TGGACAACGA CTCCATTTAC CATCACAGCT 8040 

TCTCGTGGTT TGACGGTTGG TGCAGATATT GATTACGTTT TGGTTCAACC TGCTGGTGAA 8100

GCTCGTAAGT TTGTCGTTGC TGCTGAATTA TTGACTAGCT TGTCTGAGAA ATTTGGCTGG 8160 

GCTGATGTTC AAGTTTTGGA AACTTACCGT GGCCAAGAAC TCAACCACAT CGTAACAGAA 8220 

CACCCATGGG ATACAGCTGT AGAAGAGTTG GTAATTCTTG GTGACCACGT TACGACTGAC 8280 

TCTGGTACAG GTATTGTCCA TACAGCCCCT GGTTTTGGTG AGGACGATTA CAATGTTGGT 8340 

ATTGCTAATA ATCTTGAAGT CGCAGTGACT GTTGATGAAC GTGGTATCAT GATGAAGAAT 8400 

GCTGGTCCTG AATTTGAAGG TCAATTCTAT GAAAAGGTAG TTCCAACTGT TATTGAAAAA 8460 

CTTGGTAACC TCCTTCTTGC CCAAGAAGAA ATCTCTCACT CATATCCATT TGACTGGCGT 8520 

ACTAAGAAAC CAATCATCTG GCGTGCAGTT CCACAATGGT TTGCCTCAGT TTCTAAATTC 8580 

CGTCAAGAAA TCTTGGACGA AATTGAAAAA GTGAAATTCC ACTCAGAATG GGGTAAAGTC 8640 

CGTCTTTACA ATATGATCCG TGACCGTGGT GACTGGGTTA TCTCTCGTCA ACGTGCTTGG 8700 

GGTGTTCCAC TTCCTATCTT CTACGCTGAA GATGGTACAG CTATCATGGT AGCTGAAACT 8760 

ATTGAACACG TAGCTCAACT TTTTGAAGAA TATGGTTCAA GCATTTGGTG GGAACGTGAT 8820 

GCCAAAGACC TCTTGCCAGA AGGATTTACT CATCCAGGTT CACCAAACGG CGAGTTCAAA 8880 

AAAGAAACTG ATATCATGGA CGTTTGGTTT GACTCAGGTT CATCATGGAA TGGAGTGGTG 8940 

GTAAACCGTC CTGAATTGAC TTACCCAGCC GACCTTTACC TAGAAGGTTC TGACCAATAC 9000 

CGTGGTTGGT TTAACTCATC ACTTATCACA TCTGTTGCCA ACCATGGCGT AGCACCTTAC 9060 
AAACAAATCT TGTCACAAGG TTTTGCCCTT GATGGTAAAG GTGAGAAGAT GTCTAAATCT 9120

CTTGGAAATA CTATTGCTCC AAGCGATGTT GAAAAACAAT TCGGTGCTGA AATCTTGCGT 9180

CTCTGGGTAA CAAGTGTTGA CTCAAGCAAT GACGTGCGTA TCTCTATGGA TATCTTGAGC 9240

CAAGTTTCTG AAACTTACCG TAAGATTCGT AACACTCTTC GTTTCTTGAT TGCCAATACA 9300

TCTGACTTTA ACCCAGCTCA AGATACAGTC GCTTACGATG AGCTTCGTTC AGTTGATAAG 9360

TACATGACGA TTCGCTTTAA CCAGCTTGTC AAGACCATTC GTGATGCCTA TGCAGACTTT 9420

GAATTCTTGA CGATCTACAA GGCCTTGGTG AACTTTATCA ACGTTGACTT GTCAGCCTTC 9480

TACCTPGATT TTGCCAAAGA TGTTGTTTAC ATTGAAGGTG CCAAATCACT GGAACGCCGT 9540

CAAATGCAGA CTGTCTTCTA TGACATTCTT GTCAAAATCA CCAAACTCTT GACACCAATC 9600

CTTCCTCACA CTGCGGAAGA AATCTGGTCA TATCTTGAGT TTGAAACAGA AGACTTCGTC 9660

CAATTGTCAG AATTACCAGA AGTTCAAACT TTTGCTAACC AAGAAGAAAT CTTGGATACA 9720

TGGGCAGCCT TCATGGACTT TCGTGGACAA GCACAAAAAG CCTTGGAAGA AGCTCGTAAT 9780

GCAAAAGTTA TCGGTAAATC ACTTGAAGCA CACTTGACAG TTTATCCAAA TGAAGTTGTG 9840

AAAACTCTAC TCGAAGCAGT AAACAGCAAT GTAGCACAAC TTTTGATCGT GTCTGAGTTG 9900

ACCATCGCAG AAGGACCAGC TCCGGAAGCT GCCCTTAGCT TCGAAGATGT AGCCTTCACA 9960

GTTGAACGTG CTACTGGTGA AGTATGTGAC CGTTGCCGTC GTATCGACCC AACAACAGCA 10020

GAACGCAGCT ACCAGGCAGT TATCTGTGAC CACTGTGCAA GCATCGTAGA AGAAAACTTT 10080

GCGGAAGCAG TCGCAGAAGG ATTTGAAGAG AAATAAGATT GAAAAGTCTA GGCAAAATTC 10140

AATTTGAGAA GAAAAGACAA CTAATTTTAT AGTCTATTAA ACGCATTGTA TCACGTTTTT 10200

GAATACCTGA TATGATGCGT TTTTTATTTA TTTTAAAAAT TTGCGAGGTA TGACTTTTTA 10260

TACTCAACAA GAATCAAAGA GAAACTTAGC AAGCTAACAG TAGTAAGATA AAATAGGAAT 10320

TTGATATTAG GGATAAGATT GGTAAATAGT GTAATATTTT TACAACAATA AATTTATATA 10380

GTTATTTCTG GTTTCTGAAA AGTATTATAT TTTATTTCAT ATTATACAAA TTTTTATTTT 10440

ATAATATCAG AACATACTTT TTTTAAAAGC AAATATGATA CAATTTTATT TGAAAAAAAT 10500

AAAAAAGGAG ATTTTATTAT AAAATTAAAA AGACTTGCTT TAATTAGTGG TATCGTCGGT 10560

CTTGTGGGAG GAATTTTACT TCTTATTGGT CCTTTTGTCT TGTTGGGAAT AGCGGTAAAC 10620

ACAGCTGCTA CAACTCTTAA TGGAGGAGCT ACTGCAGGGG CTTTTTCAGG TGTAGCCTTA 10680

CTCTTGAATG CCTTGAAGAT TGCAAATCTT GTTCTTGGTA TCATTGCTAT TGTTTACTAT 10740

AAAGGAGATA AGCGTGTAGG TGCAGCTCCG TCTGTACTAA TGATTGTTTC TGGTGGAGTT 10800

AGTCTCATTC TATTCCGTTC TTAGGATGGG TTGGGGGGAT TTTTGCTATT ATCGGAGGAT 10860




CTCTATTCCT TTCAACATTG AAGAAATTCA AATCAGAAGA ATAAAAGGTA TTTTAGCATG 
AAAAGAACAA AAAAGTTTAT CGGTATAGGA GTAGCTCTAT TATCTCTTTC TCTTCTAGTT 
GCATGTGGAA CATAAAGTTC AAAGAATACT TCAACAAGTA ATGATGAGAA GACAGTAGCA 
ACATCCAATA GTTCAAAAGA AACAATCACT TTCGATACAC CGGTTGTAAC AGACGATGCG 
ATTGAATCAA TACGCACTTA TGCAGATTAT ATAGATCTTT ATAAAAATAT TTTTGATGAT 
TATTTTACTA AAGCTGAGGA AGGTTTCAAA GGCATAGCTA TGGAAAATAA TGACTCGTTT
ACTAAACTAA AAGAGTCAAC TCAAAAATTA TTCGATGCGC AGAAAAAAAG GTTAAATAAT 
GAAGATAGAA TAGAAACAAC CAAAAACAAT GTGATTGCCA AACATTGTCA AACAGTCCTT 
TCCTTTTTGG TTTTGACTAG CTTTTTTGTG AAAAATTGTG TAAAATAGAA TAGATAAACG 
AGGGGAAACC TCGGAAAATT TAAAGGAGAA TCCATCTAAT GGTAAAATTG GTTTTTGCTC 
GCCACGGTGA GTCTGAATGG AACAAAGCTA ACCTTTTCAC TGGTTGGGCT GATGTTGATT 
TGTCTGAAAA AGGTACACAA CAAGCGATTG ACGCTGGTAA ATTGATCAAA GAAGCTGGTA 
TCGAATTTGA CCAAGCTTAC ACTTCAGTAT TGAAACGTGC TATCAAAACA ACTAACTTGG 
CTCTTGAAGC TTCTGACCAA TTGTGGGTTC CAGTTGAAAA ATCATGGCGC TTGAACGAAC 
GTCACTACGG TGGTTTGACT GGTAAAAACA AAGCTGAAGC TGCTGAACAA TTTGGTGATG 
AGCAAGTTCA CATCTGGCGT CGTTCATACG ATGTATTGCC TCCAAACATG GACCGTGATG 
ATGAGCACTC AGCTCACACA GACCGTCGTT ACGCTTCACT TGACGACTCA GTTATCCCAG 
ATGCTGAAAA CTTGAAAGTG ACTTTGGAAC GTGCTCTTCC ATTCTGGGAA GATAAAATCG 
CTCCAGCTCT TAAAGATGGT AAAAACGTAT TCGTAGGAGC TCACGGTAAC TCAATCCGTG 
CCCTTGTAAA ACACATCAAA GGTTTGTCAG ATGACGAGAT CATGGACGTG GAAATCCCTA 
ACTTCCCACC ATTGGTATTC GAATTCGACG AAAAATTGAA CGTCGTTTCT GAATACTACC 
TTGGAAAATA AAAAATTGTA AGTCTAGAAT TGATTTCTAG GCTTTTTATG TTAGTATGGA 
AGTATGATAA GGAATAAAAA ACAAGATTAT GTACTGGCCT ACAAGCAACC AGCTTCAACC 
ACTTACATGG GTTGGGAAGA AGAAGCTTTA CCGATAGGCA ATGGTTCTTT AGGAGCAAAA 
GTATTTGGCC TTATAGGGGC TGAACGGATT CAATTTAATG AAAAAAGTCT CTGGTCTGGA 
GGTCCACTTC CTGATAGTTC AGATTATCAG GGTGGAAATC TTCAGGATCA GTATGTTTTT 
TTAGCTGAGA TTOGGCAGGC TTTGGAGAAG AGAGATTACA ATCTGGCTAA GGAACTGGCT 
GAGCAGCACC TAATTGGGCC AAAAACGAGT CAATATGGGA CCTATCTGTC TTTTGGGGAT 
ATTCACATTG AGTTCAGCCA GCAAGGTACG ACTTTGTCTC AGGTGACGGA CTATCAGAGA 
CAGCTGAATA TTAGTAAGGC ACTTGCGACG ACTTCTTATG TCTATAAGGG AACGCGATTT 12660

GAACGTAAAG CTTTTGCGAG TTTTCCAGAT GATCTCTTGG TTCAATGTTT TACTAAGGAA 12720

GGGTTGGAAA CTCTAGATTT TACTATAGAA CTATCCTTGA CCTGTGATTT GGCTTCTGAT 12780

GGAAAGTATG AGCAGGAAAA ATCTGATTAC AAGGAGTGTA AGTTGGATAT TACTGATTCT 12840

CATATCTTGA TGAAGGGAAG AGTTAAGGAT AATGATCTGC GGTTTGCTAG TTATCTAGCT 12900

TGGGAAACGG ATGGAGATAT TAGAGTTTGG TCAGATAGGG TTCAGATATC AGGAGCCAGT 12960

TATGCCAATC TCTTCTTGGC CGCTAAGACG GATTTTGCCC AAAATCCTGC TAGCAATTAT 13020

CGCAAGAAAC TAGATTTAGA GCAACAGGTG ATAGACTTGG TGGACACAGC TAAAGAAAAG 13080

GGCTATACCC AATTGAAATC AAGGCATATC GAGGACTACC AAGCCTTATT CCAGCGTGTT 13140

CAATTGGATT TGGAAGCTGA TGTTGACGCA TCCACTACAG ATGATTTGTT AAAAAATTAT 13200

AAGCCACAAG AAGGGCAGGC TTTGGAGGAG CTGTTCTTCC AGTATGGACG GTATTTATTG 13260

ATTAGTTCGT CCAGAGACTG CCCAGATGCT CTACCAGCTA ACCTACAGGG AGTCTGGAAT 13320

GCGGTCGACA ATCCTCCTTG GAATTCGGAC TATCACTTAA ATGTCAATCT GCAGCTGAAT 13380

TATTGGCCAG CCTATGTTAC CAATCTCCTA GAGACGGTCT TTCCAGTCAT CAACTATGTA 13440

GATGATTTGC GTGTCTATGG TCGTCTAGCG GCTGTAAAGT ATGCAGGAAT CGTCTCTCAG 13500

AAAGGTGAGG AGAATGGTTG GTTGGTTCAT ACTCAAGCGA CTCCCTTTGG TTGGACGGCA 13560

CCTGGTTGGG ATTACTATTG GGGTTGGTCA CCAGCTGCCA ATGCGTGGAT GATGCAAACC 13620

GTTTATGAAG CCTATTTATT TTATAGGGAC CAAGACTATC TCAGGGAGAA AATTTATCCC 13680

ATGTTGAGGG AAACGGTTCG TTTTTGGAAT GCCTTTTTAC ATAAGGATCA GCAGGCGCAG 13740

CGTTGGGTGT CTTCTCCGTC TTATTCCCCA GAACATGGGC CGATTTCGAT TGGCAATACC 13800

TATGACCAAT CTCTGATTTG GCAGTTATTT CATGATTTTA TTCAGGCTGC TCAGGAATTG 13860

GGACTGGATG AGGACTTGTT GACTGAGGTT AAGGAGAAGT CTGATTTACT AAATCCTTTG 13920

CAAATCACTC AATCTGGTCG AATCAGGGAG TGGTATGAGG AGGAAGAGCA GTATTTTCAA 13980

AATGAGAAAG TGGAGGCCCA GCATCGGCAC GCTTCCCATC TAGTGGGACT CTATCCTGGC 14040

AATCTCTTTA GCTACAAGGG ACAAGAGTAT ATTGAAGCGG CGCGTGCTAG CCTCAATGAT 14100

CGTGGAGATG GCGGCACAGG CTGGTCCAAG GCTAATAAGA TCAATCTCTG GGCGCGTTTG 14160

GGAGATGGCA ATCGAGCCCA TAAATTATTG GCAGAGCAGT TAAAGACATC CACCTTGCAA 14220

AATCTTTGGT GTAGCCATCC TCCTTTTCAG ATAGATGGTA ATTTTGGTGC TACTAGTGGC 14280

ATGGCAGAAA TGTTACTCCA GTCTCATGCA GCTTATCTGG TACCTCTAGC TGCCCTACCT 14340

GATGCTTGGT CAACAGGTTC TGTTTCAGGC TTAATGGCAC GTGGACATTT TGAAGTGAGC 14400

ATGAGCTGGG


AAGATAAAAA 


ACTCTTACAG 


TTGACCATTT 


TATCAAGGAG 


Tf5fiArtrtArt.a<p 


14460


TTGCGAGTTT 


CTTATCCAGA 


TATTGAGAAG 


AGTGTGATTA 


AAATGAATCA 


AHAAAAAATA 




AAAGCGAAAT 


GCATGGGGAA AGATTGTATT 


TCGGTGGCAA 


CAGCAGAAGG 






CAATTTTATT 


TTTAAGAAGA TGTTATAAGG 


CAGTAATTTG 


AAACTGCCTT 


TTAAT AArtrt A 


14640


TTTAAGAATA 


TAAGCAGTTT 


TCAACTAGTT 


GAAAAAACGT 


TATAATGATA 


ATArtrtAartTa 


14700


ATACTCAATG 


AAAATCAAAG 


AGCACAAACT 


AGGAAGCTAG




f"*TV A a A A f* A 


14760


TGTTTTGAGG 


TTGCAGATGG 


AAGCTGACGT 


GGTTTGAAGA 


Ci Af! ATTTTffi 

UftUrt 1111 


artrtar* trt a a 
ALrtjAla I Al AA 


14820 


TTTGTTTGAT 


AGAGGGTGGG 


TCTGATGGCT 


•PATATTY^ArtA 
I r\ 1 1\ L L \jr\\jr\ 


*TV2 aaara rtiv* 


1 1 At-AAGCGT 


14880 


TATCAGGTTG 


GGGACACGGA 


GATTGTGGCC 


A ATTY^TTi A"TV* 
nn I l\j I un 1\» 


iuAAl 1 1 IvjA 


GATTGAAAAG 


14940 


GGGGAGCTGG 


TTATTATCCT 


TGGTGCTTCA 




1 t- AAv-AIj i 


TCTTAACCTT


15000 


CTTGGGGGAA 


TGGATACCAA 


TGATGAAGGG 


unnn 1 v. 1 vA? A 


TTGATGGTGT 


TAATATTGCG 


15060 


GATTATARTT 


CCCACCAGCG 


CACCAATTAC 


V-O i AuAAA lu 


A 1 Ls 1 i i 


TGTTTTTCAG 


15120 


TTTTATAATP 

X X X X *• A nil X V* 


TAGTTTCTAA 


TCTGACAGCT 


& Art/"* a a & inv 
AnuuAAAAlb 


1 vjVjAAL. i CjLjC 


TTCTGAAATT 


15180 


GTGACAGATG 


CCTTGAATCC 


TGATCAGGCC 




1 HWj 1 V- 1 VjvjL. 


TCATCGTCTC 


15240 


AATAACTTTP 


CAGCCCAGCT 


TTCTGGAGGG 




«Au I V. 1 A 1 


TGCACGCGCG 


15300 


GTAGCCAAAA 


ATCCTAAAAT 


TCTCCTTTGT 


rtATrtAACP RA 


\- i vjVj/\ O V- V- I 1 




15360 


ACGGGCAAGC 


AGGTTTTGAA 


AATTCTCCAA 


GACATGTCTC 


GTCAAAAGGG


ArtPrtarvirtTrt 

rV v3 v. vj/\v». vjvj l vj 


15420


ATCATCGTGA 


CTCATAATGG 


AGCTTTGGCG 


CCCATTGCTG 


ATCGCGTGAT 


TCAAATHTAP 




GATGCCAGTG TCAAGGATGT GGTGCTCAAC CAGCATCCTC AGGATATTGA CAGTTTGGAG 15540

TACTAGCATG ATCAAGCGAA AAACTTATTG GAAGGACTTA GTTCAGTCCT TCACAGGCTC 15600

v-AAvsUvjOCGT TTTTTATCCA TCTTGATCCT GATGATGTTG GGATCTCTAG CCTTAGTAGG 15660

CCTCAAAGTA ACCAGTCCCA ACATGGAGGC GACAGCTAAT GCTTATTTAA CAACTGCTCA 15720

AACCTTGGAT TTGGCAGTCA TGTCTAACTA TGGCTTGGAT CAAGCAGACC AAGAAGAACT 15780

AAAACAGACG GAGGGCGCAG AGGTCGAGTT TGGCTATTTG ACAGATGTGA CTATGGATAA 15840

TGGGCAGGAT GCCATTCGGC TGTACTCCAA ACCAGAGCGA ATTTCAACCT TTCAGCTAAG 15900

AAAGGGACGA CTTCCTCAGT CAGACAAGGA AATCGCTTTG GCCACTCATT TGCAAGGCCA 15960

ATACAGCGTG GGACAGGAGA TTAGTTTTAA AGAAAAAGAA GAGGGTCATT CCTCTTTAAA 16020

AGACCATACT TATACCATTA CTGGTTTTGT GGATTCGGCT GAAATCCTCT CCCAGCGAGA 16080

TATGGGCTAC GCAGGAAGTG GAAGTGGGAC TCTGACAGCC TATGGGGTGA TTTTACCTAG 16140

TCAATTTGAT CAGAAAGTCT ACAATATAGC TCGTTTGAAA TATCAAGATT TAGCGGGTTT 16200

AAATGCCTTT TCATCAGCTT ATGAAGAAAA ATCCAAGCAA CATCAAGAAG AGCTTGAACA 16260

AATTTTATCA GATAATGGCA AGGTACGTCT GCAACTTTTG AAAAAAGAAG GACAAGAGTC 16320

TCTAGACAAG GGGCAAGAGA CCCTTGACAA GGCTCAGACT AATTTGCAGG AAGGCAAGCG 16380

TCGTTTAGCA GCTGCTCAAG CTCGTATACA GGCTCAAGAA AGTCAACTAG CCTTGTTTCC 16440

TCAAGTTCAG AGAGAGCAGG CTAGTGCTCA ACTTACCCAA GCCAAGCAGG AATTGGGCAA 16500

GGAAGAGGAC AAACTAAAGC AAGCTGAACA AAATCTAGCC CAAGAAAAGG AAAAATTAGA 16560

AAAACATCAG CAAGTCTTGG ATGATTTGGC GGAGCCAAGG TATCAGGTTT ATAATCGTCA 16620

GACCATGCCA GGTGGTCAGG GCTATCTTAT GTATAGCAAT GCTTCATCCA GTATTCGAGC 16680

AGTGGGCAAT ATCTTTCCTG TGGTACTTTA TGCCGTAGCA GCCATGGTGA CCTTTACGAC 16740

CATGACTCGC TTTGTAGACG AAGAGCGAAC TCATGCAGGG ATTTTTAAGG CCTTGGGTTA 16800

TCGTAGTAAG GATATTATCG CCAAGTTTCT CCTTTATGGA CTAGTAGCTG GGACTGTCGG 16860

AACGGCTCTA GGTAGTATAC TTGGTCATTA TTTGCTAGCC AGTGTAATTT CAAGTGTCAT 16920

TACAAAAGGC ATGGTGGTGG GAGAAACTCA GATTCAGTTC TATTGGACCT ATAGCTTACT 16980

AGCTTTTGTC TTGAGCTTGT TGGCGAGTGT GTTACCAGCC TATCTGGTGG CTTGGAGGGA 17040

ACTTCATGAC GAAGCAGCCC AGCTTCTACT TCCTAAACCT CCTGTCAAAG GAGCl'AAAAT 17100

CTTATTGGAG CGTATCGGTT TTATCTGGCG TCGTCTCAGT TTTACTCATA AGGTAACAGC 17160

CCGCAACATC TTTCGTTATA AGCAGAGAAT GTTGATGACA ATCTTTGGTG TGGCAGGTTC 17220

TGTAGCTCTG CTCTTTGCAG GTTTGGGAAT CCAATCTTCT GTAGCAGGAG TTCCGTCTAA 17280

ACAGTTTCAA CAAATCCAAC AGTATCAGAT GCTTGTCTCT GAAAATCCTA GTGCGACCAA 17340

TCAGGACAAG GTAGAGCTAG CAGAAGTGTT GAAAGGGCAG GAGATACTAG CCTACCAGAA 17400

AATCTATTCT AAAGCGCTAT ACAAGGATTT CAAAGGCAAA GCTGGTCTTC AAAACATTAC 17460

TCTTATGATG ATAGAGAAGG AAGATTTGAC TCCCTTTATC CATCTTCAAC ATCATCAGCA 17520

GGAGCTGACA TTAAAAGATG GCATCGTTAT TACAGCTAAA CTCGCCCAGC TGGCAGGTGT 17580

CAAGGTTGGG CAGACTTTAG AAATTGAAGG TAAGGAACTA AAGGTCGTTG CTATTACTGA 17640

GAACTACGTT GGTCACTTTA TTTATATGAG TCAGGCTAGC TATGAGCAAC TTTACGGACA 17700

GCTACCCCAA GCCAACACTT ATCTGGTCTC ATTAAGGGAT ACCAGTGCAA CTAGTATCGA 17760

AAGTCAGGCG GGCTTGCTTA TGAATCAATC TGCGGTGTCC AGCGTTGTCC AAAATGCTTC 17820

AGCCATTCGA CTCTTCGACT CTATCGCTAG CTCACTCAAT CAGACCATGA CCATCTTGGT 17880

CATCGTATCG GTTCTATTAG CTATTGTCAT CCTTTACAAT CTGACCAATA TCAACGTAGC 17940

TGAGAGAATC


CGTGAACTCT 


CCACTATCAA 


GGTTCTTGGT 


TTTCATAATA


A rLxAAlv 1 LAv. 


18000 


CCTCTACATT TACCGTGAGA 


CGATTGTGCT 


GTCCCTTGTG 


GGAATCGTAC 


TTGGTPTY2AT 


18060


AGCTGGTTTC 


TATTTACACC 


AATTTTTGAT 


TCAAATGATT 


TCGCCTGOGA 


PTATTrTPTT 


18120


TTATCCGCAG 


GTAGGCTGGG 


AAGTCTATGT 


AATCCCAGTG 


GCAGCAGTAA 


V-\-*»X\-A_ 111 


18180


GACCTTGCTT 


GGTTTCTTCG 


TCAATTATTA 


TCTGAGAAAG 


GTTGATATGT 


T AG A IkfirPPT 




GAAATCTGTA 


GAGTAAGGTA 


GTTATTTTTA 


GCTGATTGAA 


A .L V-» A ^\ XXI r\ 


V* 1 MA 1 A 1 1 V_A 


18300 


AAAATCCTCC 


GTTTCAAAGA 


GCAGGGAACT 


CTTTGTGACA 


GAGGATTTTT 
unvjvmi a x x i 


t p t a t & cirinf* 


18360 


TTTAGCAGCT 


GCAATTGCGG 


CTTCGAAGTT 


TGGCTCAGAA 


TTGATATTAT


ppappTaTnv 

V_V.AL.Vj 1 A 1 IV 


18420 


AACGTAGCGA 


ATCGTATTGT 


CARTATCfiAfi 


GAP A A AH APT 


gpgpptw^t 2\ 

w-VJi-V* 1\jV_ 1 A 


A 1 AWjTGCCA 


18480 


TTCGTTGATC 


AAGAGGGCAT 


Art 1 VUvVj^-^.^ 


Va/UUiunnluu 


1 l_ AAAVj 1 Alj 1 


L I vi AAAGC AT 


18540 


AATGGCATTG 


TCAAGGCCTT 




LLAALul 111 


1 vjnbLAAAAb 


GTAGGTCCAT 


18600 


TGAAACAGTC 


AATACGACCG 


rp/v|wTip m r* ri » 
J Vj 1 Ibi LLAu 


1 V_t_Alj<_(_ AA A 


TCTTCATTAA 


AACGACGTGT 


18660 


TTGAGTTGAG 


CAGATGCCTG 


TATPfiATAfiA 
1 A lVun 1 AuA 


AV-\jAAI_IjAI.A 




TTTTCTTGCC 


18720 


ATCAAAATCA GCCAGAGATT 


TTTT Afi A A Af2 
ill k AVaAAAVj* 


ftlv. i vj 1 1 v> 1 A 


Va I AAvjAVjAAA 


AATCAAGCGC 


18780 


CTTGTCGCCG 


ACTTGTAGTT 


r 1 m 1 w 1 ■ fi PpnV^ 

llllt nv^> 1 Vj 1 


nAAVjL 1 V- AL. A 


tabATTTCCGA 


GAAAAGTTAC 


18840 


CATAGGATAC 


TCCAATCTTT 




TTTAGorna & 

111 nVA. luAA 


nL Avj 1 VJVjtjxAA 


TTTTCCAATG 


18900 


ATTTGACCGG 


AAATATGGGC 


ATAGAAAAAA 


CGCCAGCTCA 


TGTG A & & TCI 
x \9 lununn 1 o 


AV^llTTTCA 


18960 


TAGGTTTATT 


TTGCCAATCC 


TTCAGCAATC 


TTGTCAAGGT 


TGTATTT{"*AT 


V-A 1 VjV_ 1 Vj 1 A\3 


19020


TAGCTGTCGC CTTCTTTACC TTGTTCTGCG ATAGAGTCAG TAAAGATTTG AGCGTAGATT 19080

GGGATGTTTG TGTCTTGAGA AACAGTTTTC ATTGGACGGT CATCCACACT TGATTCTACA 19140

AAGAGTGATG GAACTTTTGT TTGGCGAAGT TTTTCAACCA AGGTCTTGAT TTGTTCAGGA 19200

GTTCCTTCTT CTTCAGTATT GATTTCCCAG ATGTAAGCAC TTGGGACACC ATAGGCTTTA 19260

GAGAAGTATT TGAATGCTCC TTCGCTGGTT ACAATGAGTT TCTTTTCAGC AGGGATCTTA 19320

TTAAATTTAT CCTTACTTTC TTTATCAAGT TTGTCTAACT TATCAGTATA TTCTTTGAGA 19380

TTTTTTTCAT AGAATTCTTT ATTGTTAGGG TCTTTGGCGC TCAATTGTTT GGCGATATTT 19440

TTAGCAAAAA TAATACCGTT TTCAAGGTTA AGCCAAGCGT GTGGGTCTTC TTTTCCTTTT 19500

TCATTTTGAC CTTCAAGGTA GATAACATCA ACGCCGTCGC TGACTGCGAA GTAGTCTTTG 19560

TTTTCAGTTT TCTTGGCATT TTCTACCAAT TTTGTAAACC AAGCATTGCC ACCTGTTTCA 19620

AGGTTGATAC CGTTATAGAA AATCAAATTA GCCTCAGAAG TTTTCTTAAC GTCTTCAGGA 19680

AGTGGTTCGT ATTCGTGTGG GTCTTGCCCA ATCGGAACGA TACTATGAAG GTCAATTTTG 19740

TCACCAGCAA TATTTTTAGT AATATCAGCG ATGATTGAGT TTGTAGCAAC AACTTTTAGT 19800

TTTTGACCAG AAGTTGTATC TTTTTTTCCG CTAGCACATG CTACAAGAAT GATTGCAGAA 19860

AGAAAGAGAA CGAGTAATGT ACCTAATTTT TTCATTAGAT CCTCCAATTT ATTAGGGCTT 19920

TGCCCCTTAT TTTAACAAAT GTTTATTTTT CAGTTTCAAA TATCGTTGTT TGGGAGCGAT 19980

AAAGAAGCTA ATGAGAAAGA AACTAGCAGC TGTAAGCACG ATACTAGAAC CTGCCGCAAC 20040

ATTAAAACTA TAGCCAATAA AGAGTCCCAA AACTGAAGCA GTAGCTCCGA AGGTTGAGGA 20100

AAGGAAAATC ATACTTTTCA GACTATTAGC ATACAGATAA GCAGTTGCAG CTGGGGTAAT 20160

CAGCATGGCT ACAATCAGGA TAGTTCCGAC ACTTTGCATG GCTGTCACAG ACACGAGAGT 20220

CAGGAGTACC ATGAGAAGGT AGTGATAGAA ATTGACAGGC ATTCCCATGG CTTTAGCCAA 20280

GAGTTCATCA AAGGAAGTTA TCAAGAGTTG CTTGAAGAAA ATCCAGATTA ACAAGAGGAT 20340

AGCTGCCCCC ACACCCATAG TAATAAACAT ATCCGTATCT TGGACGGCCA GGATATTACC 20400

AAAAAGGATA TGGAAAAGGT CAGTTGAACT TTTAGCGACA CCAATCAAGA TGATACCGAG 20460

GGCTAAGAAA GAAGAAAAGG TAATGCCGAT GGCGGTATCG CTTTTGATAA TCGAGTTTCC 20520

TTTGATGTAG GTAATGATGA TGGCAGCTAG CAATCCAAAG ACAATGGCTC CGATAAAGAA 20580

GTCAAGGCCC AAGATGAAGG ATAGGGCTAC ACCTGGTAAG ACAGCATGTG AAATGGCATC 20640

TCCCATGAGT GACATCCCGC GTAGAATAAT GAAACATCCC ACAGCTCCAG CTACAATCCC 20700

GACGACAATA GCTGTTATCA AGGCATTTTG TAGGAAATGG AATTTTTGCA ATCCATCGAT 20760

AAATTCTGCA ATCATAGGTC ACCTCCATTG AAAAAGAGTT GATTACCGTA AGCTTCTTTT 20820

AGATTGGTTT CGGTAAAAGT TTCTTTTGTT GGACCAAAGG CAATCACTTC TCGATTGACA 20880

AGTAAGACTT GATCGAAGTA GTGGGGAATC TTGCTGAGGT CGTGGTGAAC GATGAGAACC 20940

GTCTTCCCAG CTTTTTTCAA ATCTCTCAGC GTATTCATGA TGATTTCCTC ACTGACAGAG 21000

TCAATCCCAG CAAAGGGTTC ATCCAAGAGG ATATAGTCGG CTTCCTGCAC CAAACATCTG 21060

GCAATCAAGA CCCGCTGGAA TTGACCTCCA GACAGTTGAC TAATTTGACG TTCAGCGTAG 21120

TCAGCTAGGC CGACGATTTC AAGGGCCTCT TGCACTTTCT TCCAATGTTT AGCCTTTAAA 21180

CTTCGAAAGA GAGGAATAGA GGGAAATAGT CCTAACGAGA CGCATTCCTT GACCTTGATG 21240

GGAAAGTTGT AGTCGATATT GATTTTTTGT TCGACATAGG CAATTCGGTG TAAGGATTTT 21300

TTAACTTCCT TGTCATCGAG AAATGCCTGA CCTTGATGTG GGATAATTCC CAACATACCT 21360

TTTAATAGTG TTGATTTCCC AGCGCCGTTT GGACCAATGA TGCCGGTAAT TGTTGGTCCA 21420

TGGAGCACTA GTGAAATATC CTTAAGTGCC AACGTTTCTT TGTAGGAGAC ACTGAGGTTT 21480

TCGATACGTA TCATAAACTT GTATTCCTCC TGTCTCTTAA TATACATTAA AAAAAAAATT 21540

AAGTCAAGTT AATTTTTGAA AAAATTAAAA TAATAACTGA AAAATAGATT CTAAAGATAA 21600 

CTTTCAGGAT AAATTTCTAA ATTATAAAAC GCATAGTATC AAGTGTAAAA AACTTGGAAT 21660 

TATGCGTTTT ATCATGGAAA GATTTTTTAT AATAGCTAAA AAATAA 21706 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GATCCCCAGG AAAAACCGAG GTTTTCCCAA TCAATCGTTA CTGTCATATT CCACTCCTTA 60 

TTCTAAAAAC CTATTTCTTA TATTCTACAC TATTTTTCTA AAATAGCAAG TATATTTTGT 120 

AATTTTCAGA AAATTTCTCC AATAAAAACC AACTCTTAGA ACTGATTCTT CATTTCACTT 180 

ATTTATCTTC AGTAACTACT TCCTGAAGAT AAGCGTCAAA AACTTCTTCA TCTGAAATCG 240 

TGTCAGAAAT GAAGCTTCCA TTGCTAGTGC GTTCTGACAA GTTCAAGTCT TGCAATCGGC 300 

TTTCATAGAT TGTTCCTTTA TTGGATTGGA CAAGCAGAGT TTGGTCGTTC ACATCCACTT 360 

CCGTACTGAA GAAATCGCCA ACAAATCCTT GCTCTGCAAC TGCTCCTGCC AAGAAGACAC 420 

GATGCGGTTT GTTTTTCAAC TCACGCAAGA CTTGTAATCC TCGTTTGGCA CGGCTGGTTG 480 

CTAGAATTTC CTCAATGGAA ACACGTTTCA AGCTTCCACG CTGGGTCAAG AGGTAGAAGG 540 

ACGAAGTATT ACAGATAAAG CCAGATTGGA GGACATCATC TTCTTTCAAA TTCATAGCCT 600 

TGACACCTGC TGCCTTAGCA CCGACAACCG GAACCTCTTC GATATTGAAA CGCAGGGCAT 660 

AACCATTTTG ACTAACCAAG ACAACATCAT CTAGTTTAAT CGGAGCCACT GCTACAATCT 720 

GATCTGTATC GTCTTTGAGC TTAGCATACT TGACAGACTT AGATCTATAG GTCCGCCATG 780 

GAGTGAATTC TTTTCGCTCT ACCCGTTTGA TTTGACCAAG GCGAGTCACT GCAAAGTAGG 840 

TTGTCGCATC GTCAAACTGA TCCAGTACTT CCACATAAAG GATTTCTTCA TTCGTTTCAA 900 

AGTTTGTGAT GGTTTGGCTC AGATGCTCTC CGATGTCCTT CCAACGAATA TCTGCCAACT 960 

CATGGATTGG TCTGTAGATG ACATTTCCAA GACTTGTGAA CATCAAGAGG TGCTGGGTTG 1020 

TCTTGGCAGA TTGAACAAAA ATCAAACGGT CATCATCACG CTTGCCAATT TCTTCCAAGG 1080 

TGGAAGCCGC AAAGGAACGT GGACTGGTAC GCTTGATGTA ACCTGCCTTG GTCACGCTGA 1140

CGTAGGTATC TTCCTCAGCG ATAAGACTAG CTGTATCAAT CTCAATTGCT TTCGCAGTGT 1200

CTTCTAAAGA ACTCAAACGA GGAGTTGCAA ATTTCTTCTT GACCTCACGA AGTTCTTTCT 1260

TCATGAGATT GTACATAGTC CTTTCATCAC CGATAATAGC CGCCAGCATA GCAATCTTCT 1320

CACGAAGCTC TGCTTCTTCT TCCTGCAAGA CAACCACATC GGTATTGGTC AAACGGTACA 1380

GTTGCAAAGT TACGATAGCC TCAGCCTGTT CTTCCGTAAA ATCATAGCTA ACTTTGAGGT 1440

TTTCCTTGGC GTCCGCCTTA TTCTCAGAAG CACGGATAAG AGCAATGACT TCATCCAAAA 1500

TCGAAATCAC ACGAATCAAA CCTTCGACGA TATGGAGACG TTTCTCAGCC TTTTCTTTGT 1560

CAAAGCGTGA ACGCGCCAAA ATCACTTCTC GACGGTGAGC GATATAGCTA GACAGGATTG 1620

GAACAATCCC AACCTGACGA GGTGTGAAAT TGTCAATCGC CACCATATTA AAGTTGTAGT 1680

TGATTTGTAG GTCGGTGTAC TTAAATAAGT AGTTGAGAAC AAGCTCAGTA TTAGCGTCTT 1740

TCTTAAGTTC GATAGCGATA CGAAGACCAT CACGGTCAGA CTCATCACGA ACCTCAGCAA 1800

TCCCAGCTAC CTTGTTATTA ACACGAACAT CATCGATTTT CTTGACTAGA TTGGCCTTAT 1860

TGATTTCATA AGGAATCTCA ATAATAACGA TTTGTTCCTT ACCACCTTTT AGCTTTTCAA 1920

TTTCAGTCTT GGAACGAACA ACCACGCGCC CTTTCCCAGT CTCATAAGCT TTCTTGATTT 1980

CATCACGACC CTGAATAATA GCCCCTGTAG GGAAGTCTGG TCCAGGCAAG AATTCCATGA 2040

GTTTATCAAT CTTTGCAGTT GGGTGGTCAA TCATGTAAAC TGCAGCATCT ATGACCTCAG 2100

CTAAATTATG GGGAGGAATG TCTGTGGCAT AACCAGCCGA AATCCCAGTC GAACCATTGA 2160

CCAAGAGGTT TGGAAAGGCT GCTGGCAAGA CCGTTGGTTC TTTCTCCGTA TCGTCAAAGT 2220

TCCATGCAAA AGGAACTGTC TTTTTCTCGA TATCCTGAAG AAGGTAGCCT GCAATTTCAG 2280

ACAAACGTGC CTCAGTATAA CGCATAGCCG CAGGAGGATC TCCGTCCATA GAACCGTTAT 2340

TACCGTGCAT TTCAACTAGA ATCTCACGAT TTTTCCAGTT CTGTGACATA CGAACCATGG 2400

CATCATAGAT AGAAGAATCC CCGTGTGGGT GGAAATTCCC CATGATGTTC CCGACTGACT 2460

TGGCCGACTT ACGGTAGCTC TTGTCAAAAG TATTGCTATC CTTATTCATA GAATAAAGAA 2520

TACGGCGCTG AACCGGCTTC AACCCATCAC GAATATCTGG CAAAGCCCGG TCTTGAATAA 2580

TGTACTTGGA GTAGCGACCA AAGCGCTCTC CCATGATGTC CTCCAGGGAC ATGTTTTGAA 2640

TGTTAGACAT AAGATACAAA GCCCATAAAA TACCAAGTGA AAATAGAAAA TTCTTGAAGT 2700

AAGCAAACTC ACAAGAGAAT TTATCTTTTT CACACAGTAT CTAGGGCGTG TTCAACTCCT 2760

TTCAAAGAAT GTAGAGTAGG TTTTTATGCA GTAAAAGATA TTTTACGGGA ATTCCTCCCG 2820

TGTTCAGTTA CGATAAGTAA CCAAACTATC CTGTTTGTAT TTTTCAATAT GAAAATCTGG 2880

TTTTCCAAAA TTAGTCTTAG TTTGTGTCTT AGCCGCTCCC TTAAGCGCCT CTTTGAGATA 2940

AGCACTCATA


Gr'AGATTT'TT 


PlTTl AT A AT 
LAI 1AA1AA1 


LL lbLAA ill 


ftWtWT*/^ AAA /T* A 

1 I IlaAAllA 


AGATTTTCAA 


3000 


AfrnoTT'P'P'p 


l-HLA 1 r\\j 1 v_ A 


mmp apj mppp 
1 it ALA rLLG 


ACTCTAATTT 


lCAGTTTACT 


AACATATTAT 


3060 


f r VTC t t v P r Pr* A T 
a i i v- i l iv. r\ i 


X AAAAUAL 1 VJ 


TCGTTTCTTC 


1 AL-Lb 1 AAAL 


TTGACATTAT 


CTTCAATCCA 


3120 


TTTACGGCGT


V*L* 1 ILI ALL 1 


1A1L1LLLA1 


V)AIjAALA1 l\j 


AlGCGGCGTT 


CGGCGCGCGC 


3180 


TAAATCTTCA 




RfSATGAGRGT 


nLVJlvj 1 1 1 L 1 




TTGTTTCCCA 


3240 


GAGCTGGTep 




LnLLAnu ILL 


TTTGTATCGT 


TGGAGGGTAG 


CGCCTTTACC 


3300 


GAACTGTTTA 
w.rircv* i\ji i in 


LuuAVj 1 1L i 1 


L 1 Au 1 1L ILL 


G I LLL» 1 LLAA 


GlGTAGGllA 


CTTCTTCTTT 


3360 




rnTTYtrt r n 

^Ll J luuA^A 


1 L 1 1 Vj 1 AAAG 


Auu i TjVjV>AVj*j 


GCAATATAGA 


CATGACCTGC 


3420 


PTPfi ii rr a fir* 


CIC R fV^/"* R TV* T 


a a i"vv*»T»A/~* a a 
AALUi 1 AGAA 


AAA 1 L? 1 lAAG 


AGCAAGGTCT 


GGATATGGGC 


3480 




1 LL\>LArLGG 


I t-A TGATAAT 


GATlTTATCA 


TAGTTGGCAT 


CTTCAATAGA 


3540 


\3r\r\\j l V* 1 \jL 1 


LLAAlAlllG 


lAl. lAA 1 I 


AIAAAfCATG 


GTATTGATCT 


CTTCATTTTT 


3600 


uALjVjA I A I LL 


Glla IvTTGO 


CCTTGGCTGT 


ATTGAlAAll 


TTACCACGAA 


GAGGTAGAAT 


3660 


ArTVT/TPTA R /"> 
AVjLL IuuaAL 


I IGlGajTlAC 


GACCTTGTTT 


GGCAGAACCA 


CCGGCAGAGT 


CCCCCTCAAC 


3720 


t ap A T & c a r*rn 
1 AviA 1 AuAu 1 


1 LA rTCTTAG 


CAGGATTCTT 


AGaTTGGGCT 


GGGGTCAATT 


TCCCAGACAA 


3780 


ch. RfT*/" , f* r PiT | 7\ 
tAflbLLL i 1A 


TCTTTCTTGT 


TTTTCTTCCC 


ATTTCGGCTC


TCATCACGCG 


CCTTACGTGC 


3840 


Jul. 1 1 LALbA 


GLA rvJACGGG 


CCTTGaTaGl 


CTTGCGGATG


AGGTTAGAAG 


CTAATTCCCC 


3900 


l»|H|»iwi(pp TAT* A 


a p»/~» /"•/"■ 
nbUAAAAAvju 


JrLAALi iATL 


a r*fr* a nnt a nvn 

AGCCACTATT 


CCATCCACAA 


CTGGGCGAGC 


3960 




LL 1 AG 1 i 1AX 


LL i J/GG f L 1 V3 


TCCTTCAAAC 


TGCAAGTGTT 


CTTCAGGAAC 


4020 


TAAGATAGAA 




r*T & P-TPPPTP 
L 1 AVj 1 LLL 1 L 


ACGATAGTCT 


GAACCTTCAA 


GGTTTTTATC 


4080 


TTTTTCCTTG 


AGAAGACCTG 


TTTT A OfiTG C 


ATAGTCATTC 


ATGACCTTGG 


T* R A rpp^ A /*• A 

1 AA IXiLsLAGA 


4140 


CTTGAGTCCT 


GTCTCGTGCG 


TTCCACCGTC 


CTTGGTGCGA 


ACGTTATTGA 


OtAAAGATAG 


4200 


AATGTTATCT GAGAATCCGT CATTGTACTG GAGGGCTACT TCCACTTGAA AACCATTGTC 4260

TTCCCCTTCA AAGTAAAGAA CTGGCGTCAA GATTTCCTTA TCTTCGTTGA GATAAGAAAC 4320

AAAATCTTGT ACTCCATTCT CATAGTGGAA CTCAATCGCT TCATTTGTTC GCTTGTCCGT 4380

TAAAGACAAG GTCACATTTT TCAAGAGAAA GGCTGATTCA TTAAGGCGCT CTGAAATGGT 4440

ATTGTACTTG AAATCTGTCG TAGAAAATAT AGTCGCGTCA GGCATAAAAG TAACTTTGGT 4500

GCCTGTTTTA GACTTGGGTG CTGTACCGAT TTTCTTCAAA GTCGTGACAG GTTTTCCACC 4560

ATTTTCGAAA CGTTGCTTGT AAACTGCGCC ATCACGGGTA ATTTCAACTT CTAACCAGCT 4620

AGAAAGGGCG TTAACAACGG AAGAACCCAC TCCGTGAAGT CCACCTGATG TCTTATAGCC 4680

ACCTTGACCG AATTTCCCTC CGGCATGAAG AATGGTAAAG ATAACCTCAA CAGTTGGAAT 4740


TCCCATAGCG TGCATACCTG TCGGCATCCC ACGTCCATGG TCTTGAACCG TTAGACTACC 4800

GTCTTTATTG ATAGTTACAT CAATACGATC ACCAAACCCA GACAAGGCTT CATCGACTGC 4860

ATTATCAACG ATTTCCCAAA CTAGGTGATG AAGACCAGCG CCATCGGTCG ATCCAATATA 4920

CATCCCTGGA CGTTTTCGGA CCGCATCCAA CCCTTCTAGC ACCTGAATAG CATCATCATT 4980

ATAATTGTTA ATATTGATTT CCTTTTTTGA CACAAGGAAC CTCCTATTCG TTCATCTTTA 5040

CTATTCTACA GGTTTTCCAA GGATTTTGCA AAATTTTTCT TTCTCCGATG TGACAATTTC 5100

AGCAGAGATT CTCTGCTTTT CTTTCCCAAT TCATGATATA ATAGGAGTAT GATTACAATA 5160

GTTTTATTAA TCCTAGCCTA TCTGCTGGGT TCGATTCCAT CTGGTCTCTG GATTGGACAA 5220

GTATTCTTTC AAATCAATCT ACGCGAGCAT GGTTCTGGTA ACACTGGAAC GACCAACACC 5280

TTCCGCATTT TAGGTAAGAA AGCTGGTATG GCAACCTTTG TGATTGACTT TTTCAAAGGA 5340

ACCCTAGCAA CGCTGCTTCC GATTATTTTT CATCTACAAG GCGTTTCTCC TCTCATCTTT 5400

GGACTTTTGG CTGTTATCGG CCATACCTTC CCTATCTTTG CAGGATTTAA AGGTGGTAAG 5460

GCTGTCGCAA CCAGTGCTGG AGTGATTTTC GGATTTGCGC CTATCTTCTG TCTCTACCTT 5520

GCGATTATCT TCTTTGGAGC TCTCTATCTT GGCAGTATGA TTTCACTGTC TAGTGTCACA 5580

GCATCGATTG CGGCTGTTAT CGGGGTTCTG CTCTTTCCAC TTTTTGGTTT TATCCTGAGT 5640

AACTATGACT CTCTCTTCAT CGCTATTATC TTAGCACTTG CTAGTTTGAT TATCATTCGT 5700

CATAAGGACA ATATAGCTCG TATCAAAAAT AAAACTGAAA ATTTGGTCCC TTGGGGATTG 5760

AACCTAACCC ATCAAGATCC TAAAAAATAA AATGCCAGTT CTGTACTGCC CCCAAACAGT 5820

TAGACAAATA ATTTATCCAA AGGATTTAGT TCTGTACTGC ACAGGACTAA GTCCTTTTAG 5880

TTTTACCTTA ATTCGTTTGT TGTTGTAGTA ATCAATATAG TCTATAATGG CTTGTTCCAA 5940

TTGATTAAGT GATTTAAATG TTTTCTCATA GCCATAAAAC ATTTCGGATT TTAAAATGCC 6000

AAAGAAAGAT TCCATCCTAC CGTTGTCTTG GCTGTTGCCC TTACGTGACA TGGATGCTTG 6060

AATTCCCTTA CTCTCTAGGA ACCGATGATA AGAATCGTGT TGGTATTGCC AGCCTTGGTC 6120

ACTATGGAGA ATCGTATTCT CGTAGTGCTT CTCTGTGAAT GCCTGTTCCA A 6171


(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18475 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TATTACAAAT AAAAAAACGG AGGAGTGCTT TATGAAAGCC TATACTTATG TTAAACCAGC 60 

ACTTGCTTCT TTTGTTGATG TAGACAAACC AGTTATTCGC AAGCCAACAG ACGCTATTGT 120 

GCGTATTGTA AAAACCACTA TTTGTGGAAC AGACCTCCAT ATTATCAAAG GGGATGTTCC 180 

TACTTGCCAA AGTGGTACCA TTCTTGGCCA CGAAGGGATT GGGATTGTTG AAGAAGTTGG 240 

GGAAGGAGTT TCCAACTTCA AAAAAGGTGA CAAGGTCTTG ATTTCTTGCG TCTGTGCCTG 300 

TGGTAAATGC TACTACTGTA AAAAAGGAAT TTATGCTCAC TGTGAAGACG AAGGGGGCTG 360 

GATTTTCGGT CACTTGATTG ATGGTATGCA GGCTGAATAT CTACGTGTCC CTCATGCAGA 420 

TAATACTCTT TACCATACTC CAGAAGACTT GTCAGATGAA GCTTTGGTTA TGCTGTCAGA 480 

CATTCTGCCT ACTGGATATG AAATTGGTGT CTTAAAAGGG AAAGTAGAAC CTGGTTGCAG 540 

CGTAGCCATT ATTGGTTCAG GTCCAGTTGG ATTGGCTGCT CTTTTAACAG CCCAATTCTA 600 

TTCACCAGCT AAATTGATTA TGGTAGACCT AGACGATAAC CGCTTGGAAA CTGCCCTATC 660 

ATTCGGTGCG ACTCATAAGG TTAATTCTTC AGACCCTGAA AAAGCCATTA AAGAAATTTA 720 

TGATTTGACA GATGGTCGTG GTGTGGATGT CGCTATCGAA GCTGTTGGTA TTCCTGCAAC 780 

ATTTGATTTC TGTCAAAAGA TTATCGGTGT AGACGGAACG GTTGCCAACT GTGGTGTGCA 840 

TGGTAAACCA GTTGAATTCG ATTTAGATAA ACTTTGGATT CGCAACATCA ATGTAACAAC 900 

TGGTTTGGTA TCTACAAATA CGACTCCACA ATTGTTGAAA GCACTTGAAA GTCATAAGAT 960 

TGAACCGGAA AAATTGGTAA CTCACTATTT CAAACTCAGT GAAATTGAAA AAGCCTACGA 1020 

AGTCTTCAGT AAGGCAGCAG ACCACCATGC CATTAAGGTC ATTATCGAAA ACGATATCTC 1080 

AGAAGCCTAA GTAGTAAAAA TATTTTTGTA CATAAGTAAA TAGAAATTCA GTCATCCATC 1140 

AGATGGCTGG ATTTTTTATC AAAAAATTAA GAAATGAGCA TATTTCTTTC CTTGTCTGGC 1200 

GGAATTGGTT ATAATATACG GTACAAAGGA ATGAATGAAT ATGTATCGTG TTATAGAAAT 1260 

GTACGGAGAT TTTGAACCGT GGTGGTTCTT AGAAGGTTGG GAAGAAGATA TTGTAGCAAG 1320 

TAGAAAATTT GACCAGTATT ATGATGCTCT CAAATACTAC AAAACTTGCT GGTTTAGATT 1380 

GGAACAAGAA TCGCCTCTTT ATAAAAGTAG AAGCGACTTG ATGACCATTT TTTGGGACCC 1440 

GGAAGACCAA CGCTGGTGTG ATGAATGTGA TGAGTATTTA CAACAATACC ATTCTTTGGC 1500 

TCTTTTGCAG GATGAGCAGG TTATCCCAGA CGAAAAACTA CGCTCAGGCT ATGAAAAACA 1560 

AACCAGTCAG GAAAGGAATC GTTCTTGCCG TATGAAATTA AAATAGAGAA AAGTAACTTT 1620 

TTTGGAGTTG CTTTTTTTAT TTTTCTAACT CTTTGCGAAT AGTATAGGTG AGGAGGTAAG 1680

TATGGTTCAA GAAATTGCAC AAGAAATCAT TCGTTCAGCT CGGAAAAAAG GGACGCAGGA 1740

TATCTATTTT GTCCCTAAGT TAGACGCCTA TGAGCTTCAT ATGAGGGTAG GAGACGAGCG 1800

CTGTAAAATT GGTAGCTATG ATTTTGAAAA GTTTGCAGCC GTTATCAGTC ACTTTAAGTT 1860

TGTGGCGGGT ATGAATGTGG GAGAAAAAAG ACGTAGTCAA CTGGGTTCCT GTGATTATGC 1920

CTATGACCAT AAGATAGCGT CTCTACGTTT ATCTACTGTA GGCGATTATC GGGGGCATGA 1980

GAGTTTGGTT ATCCGTTTGT TGCACGATGA GGAGCAGGAC CTGCATTTTT GGTTTCAGGA 2040

TATTGAAGAA TTAGGCAAGC AGTACAGGCA ACGGGGACTC TATCTTTTTG CTGGTCCGGT 2100

TGGGAGTGGT AAGACGACCT TGATGCATGA ATTGTCCAAG TCACTCTTTA AAGGACAGCA 2160

AGTTATGTCC ATCGAAGATC CTGTCGAAAT CAAGCAGGAC GACATGCTTC AGTTGCAGTT 2220

GAACGAAGCA ATCGGCCTAA CCTATGAAAA TCTAATCAAA CTTTCCTTGC GTCATCGACC 2280

AGATCTCTTG ATTATCGGAG AAATTCGTGA CAGCGAGACG GCGCGTGCAG TGGTCAGAGC 2340

TAGTTTGACA GGTGCGACAG TCTTTTCAAC CATTCACGCC AAGAGTATCC GAGGTGTTTA 2400

TGAGCGTCTG CTGGAGTTGG GTGTGAGTGA AGAAGAATTG GCAGTTGTTC TGCAAGGAGT 2460

CTGCTACCAG AGATTAATCG GGGGAGGAGG AATCGTTGAC TTTGCAAGCA GAGATTATCA 2520

AGAACACCAA GCAGCCAAGT GGAATGAGCA AATTGACCAG CTTCTTAAAG ATGGACATAT 2580

CACAAGTCTT CAGGCTGAGA CGGAAAAAAT TAGCTACAGC TAAGCAAAAA AATATCATCA 2640

CCCTATTTAA CAATCTCTTT TCTAGCGGTT TTCATCTGGT GGAGACTATC TCCTTTTTAG 2700

ATAGGAGTGC TTTGTTGGAC AAGCAGTGTG TGACCCAGAT GCGTGTGGGC TTGTCTCAGG 2760

GGAAATCATT CTCAGAAATG ATGGAAAGTT TGGGATGTTC AAGTGCTATT GTCACTCAGT 2820

TATCCCTAGC TGAAGTTCAT GGCAATCTCC ACCTGAGTTT GGGAAAGATA GAAGAATATC 2880

TGGACAATCT GGCTAAGGTC AAGAAAAAAT TGATTGAAGT AGCGACCTAT CCCTTGATTT 2940

TGCTGGGTTT TCTTCTCTTA ATTATGCTGG GGCTACGGAA TTACCTGCTC CCACAACTGG 3000

ATAGTAGCAA TATTGCCACC CAAATTATCG GTAATCTGCC CCAAATTTTT CTAGGCATGG 3060

TAGGGCTTGT TTCCGTGCTT GCCCTTTTAG CACTCACTTT TTATAAAAGA AGTTCTAAGA 3120

TGAGTGTCTT TTCTATCTTA GCACGCCTTC CCTTTATTGG AATCTTTGTG CAGACCTACT 3180

TGACAGCCTA TTATGCACGT GAATGGGGGA ATATGATTTC ACAGGGAATG GAGTTGACGC 3240

AGATTTTTCA AATGATGCAG GAACAAGGTT CCCAGCTCTT TAAAGAAGTC GGTCAAGATC 3300

TGGCTCAAAC CCTGAAAAAT GGCCGTGAAT TTTCTCAGAC GATAGGAACC TATCCTTTCT 3360

TTAGGAAGGA ATTGAGTCTC ATCATAGAGT ATGGGGAAGT TAAGTCCAAG CTGGGTAGTG 3420

AGTTGGAAAT CTATGCTGAA AAAACTTGGG AAGCCTTTTT TACCCGAGTC AACCGCACCA 3480

TGAATTTGGT


GCAGCCACTG 


GTTTTTATCT 


TTGTGGCACT 


CjATTATCGTT 


TTACTTTATG 


3540 


CGGCAATGCT 


CATGCCCATG 


1A1LAAAA1A 


I UfiAGGT AAA 


TTTTTAAAAT 


GAAAAAAATG 


3600 


ATGACATTCT TGAAAAAAGC TAAGGTTAAA GCTTTTACAT TGGTGGAGAT GTTGGTGGTC 3660

TTGCTGATTA TCACCGTGCT TTTCTTGCTC TTTGTACCTA ATCTGACCAA GCAAAAAGAA 3720

GCAGTCAATG ACAAAGGAAA AGCAGCTGTT GTTAAGGTGG TGGAAAGCCA GGCAGAACTT 3780

TATAGCTTAG AAAAGAATGA AGATGCTAGC CTAAGAAAGT TACAAGCAGA TGGACGCATC 3840

ACGGAAGAAC AGGCTAAAGC TTATAAAGAA TACAATGATA AAAATGGAGG AGCAAATCGT 3900

AAAGTCAATG ATTAAGGCCT TTACCATGCT GGAAAGTCTC TTGGTTTTGG GACTTGTGAG 3960

TATCCTTGCC TTGGGCTTGT CCGGCTCTGT CCAGTCCACT TTTTCAGCGG TAGAGGAACA 4020

GATTTTCTTT ATGGAGTTTG AAGAACTCTA TCGGGAAACC CAAAAACGCA GTGTAGCCAG 4080

TCAGCAAAAG ACTAGTCTGA ACTTAGATGG GCAGACGCTT AGCAATGGCA GTCAAAAGTT 4140

GCCAGTCCCT AAAGGAATTC AGGCCCCATC AGGCCAAAGT ATTACATTTG ACCGAGCTGG 4200

GGGCAATTCG TCCCTGGCTA AGGTTGAATT TCAGACCAGT AAAGGAGCGA TTCGCTATCA 4260

ATTATATCTA GGAAATGGAA AAATTAAACG CATTAAGGAA ACAAAAAATT AGGGCAGTGA 4320

TTTTACTGGA AGCAGTAGTC GCTCTAGCTA TCTTTGCCAG CATTGCGACC CTCCTTTTGG 4380

GACAAATTCA AAAAAATAGG CAAGAGGAAG CAAAAATCTT GCAAAAGGAA GAAGTCTTGA 4440

GGGTAGCTAA GATGGCCCTG CAGACGGGGC AAAATCAGGT AAGCATCAAC GGAGTTGAGA 4500

TTCAGGTATT TTCTAGTGAA AAAGGATTGG AGGTCTACCA TGGTTCAGAA CAGTTGTTGG 4560

CAATCAAAGA GCCATAAGGT CAAGGCTTTT ACCTTGTTAG AATCCCTGCT TGCCCTCATT 4620

GTCATCAGTG GGGGATTACT CCTTTTTCAA GCTATGAGTC AGCTCCTCAT TTCAGAAGTT 4680

CGCTACCAGC AACAAAGCGA GCAAAAGGAG TGGCTCTTGT TTGTGGACCA ACTTGAGGTA 4740

GAATTAGACC GTTCGCAGTT CGAAAAAGTA GAAGGCAATC GCCTATACAT GAAGCAAGAT 4800

GGCAAGGACA TCGCCATCGG TAAGTCAAAG TCAGATGATT TCCGTAAAAC GAATGCTCGT 4860

GGTCGAGGTT ATCAGCCTAT GGTTTATGGA CTCAAATCTG TACGGATTAC AGAGGACAAT 4920

CAACTGGTTC GCTTTCATTT CCAGTTCCAA AAAGGCTTAG AAAGGGAGTT CATCTATCGT 4980

GTGGAAAAAG AAAAAAGTTA AGGCAGGTGT TCTCCTCTAC GCAGTCACCA TAGCAGCCAT 5040

CTTTAGTCTT TTGTTGCAAT TTTATTTGAA CCGACAAGTC GCCCACTATC AAGACTATGC 5100

TTTGAATAAA GAAAAATTGG TTGCTTTTGC TATGCCTAAA CGAACCAAAG ATAAGGTTGA 5160

GCAAGAAAGT GGGGAACAGT TTTTTAATCT AGGTCAGGTA AGCTATCAAA ACAAGAAAAC 5220

TGGCTTAGTG ACGAGGGTTC GTACGGATAA GAGCCAATAT GAGTTTCTGT TTCCTTCAGT 5280


CAAAATCAAA GAAGAGAAAA GAGATAAAAA GGAAGAGGTA GCGACCGATT CAAGCGAAAA 5340

AGTGGAGAAG AAAAAATCAG AAGAGAAGCC TGAAAAGAAA GAGAATTCAT AGTCAATTCA 5400

ACTATAATGC GTTGAATCCA GAATAGTCCA CTGTAGTTTC TAGAAAATTG CTGGAAATGG 5460

ATGTTAAGCT CCAATTCATT TGTTTATATC TTATTTCAGT TTACTATACT TTGTGCTAAA 5520

TTAAAGATAT GAAACATGAT TTTAACCACA AAGCAGAAAC TTTCGATTCC CCTAAAAATA 5580

TCTTCCTCGC AAACTTGGTA TGTCAAGCAG CCGAGAAACA GATTGATCTT CTATCAGACA 5640

AAGAAATTTT AGATTTCGGT GGTGGCACGG GTCTATTAGC CTTGCCCCTA ACCCCTAGCC 5700

AAGCAGGCTA AGTCAGTCAC TCTTGTAGAC ATTTCTGAGA AAATGTTGGA GCAAGCTCGT 5760

TTGAAAGTGG AGCAGCAAGC AATCAAGAAT ATCCAGTTTT TGGAGCAAGA TTTACCGAAA 5820

AATCCCTTGG AGAAAGAGTT TGATTGCCTT GCTGTTAGTC GGGTTCTTCA TCATATGCCT 5880

GATTTGGATG CGGCTCTCTC ACTGTTTCAT CAACATTTGA AGGAAGATGG GAAACTCATC 5940

ATTGCTGATT TTACCAAGAC AGAAGCTAAT CATCATGGAT TTGATTTAGC TGAACTGGAA 6000

AACAAGCTAA TTGAGCATGG TTTTTCATCT GTGCATAGTC AGATTCTCTA TAGTGCTGAA 6060

GACCTGTTTC AAGGAAATCA CTCAGAATTC TTTTTAATAG TAGCCCAAAA ATCACTCGCC 6120

TAGTCAGGGA GTGATTTTTC TATAAGGATG GAAAAAAGAA GGGAAATTTG GTAAGATAGG 6180

AATATGGATT TTGAAAAAAT TGAACAAGCT TATACCTATT TACTAGAGAA TGTCCAAGTC 6240

ATCCAAAGTG ATTTGGCGAC CAACTTTTAT GACGCCTTGG TGGAGCAAAA TAGCATCTAT 6300

CTGGATGGTG AAACTGAGCT AAACCAGGTC AAGGAGAACA ATCAAACCCT TAAGCGTTTA 6360

GCACTACGCA AAGAAGAATG GCTCAAGACC TACCAGTTTC TCTTGATGAA GGCTGGGCAA 6420

ACAGAACCCT TGCAGGCCAA TCACCAGTTT ACACCGGATG CTATTGCTTT GCTTTTGGTG 6480

TTTATTGTGG AAGAGTTGTT TAAAGAGGAG GAAATTACTA TCCTCGAAAT GGGTTCTGGG 6540

ATGGGAATTC TAGGCGCTAT TTTCTTGACC TCGCTTACTA AAAAGGTGGA TTACTTGGGA 6600

ATGGAAGTGG ATGATTTGCT GATTGATCTG GCAGCTAGCA TGGCAGATGT AATTGGTTTG 6660

CAGGCTGGCT TTGTCCAAGG AGATGCCGTT CGCCCACAAA TGCTCAAAGA AAGCGATGTG 6720

GTCATCAGTG ACTTGCCTGT CGGCTATTAT CCTGATGATG CCGTTGCGTC GCGCCATCAA 6780

GTTGCTTCTA GCCAAGAACA TACTTACGCC CATCACTTGC TCATGGAACA AGGGCTTAAG 6840

TACCTCAAGT CAGACGGATA CGCTATTTTT CTAGCTCCGA GTGATTTGTT GACCAGTCCT 6900

CAAAGTGATT TGTTAAAAGA ATGGCTGAAA GAAGAGGCGA GTCTGGTTGC TATGATTAGT 6960

CTGCCTGAAA ATCTCTTTGC TAATGCCAAA CAATCTAAGA CTATTTTTAT CTTACAGAAG 7020

AAAAATGAAA TAGCAGTAGA GCCTTTTGTT TATCCACTTG CTAGCTTGCA AGATGCAAGT 7080

GTTTTAATGA AATTTAAAGA AAATTTTCAA AAATGGACTC AAGGTACTGA AATATAAAAT 7140

AGATTTTGTT ATAATAGTTG AAAACGCTTA AAAAGGGGTA TCATGTTATG ACAAAAACAA 7200


TTGCAATCAA 


TGCAGGAAGT 


TC A AfiTTTf; A 

X ill V3A 


nhluuLnnl 1 


ATACTTAATG 


CCAGAAGAAA 


7260 


AAGTATTGGC 


GAAAGGTTTG 


ATTGAACYVTA 


1 V.UU 1 i I V?AA 


AGATTCAATT 


TCAACTGTAA 


7320 


AATTTGACGG 


CCGTTCTGAA 


C A A C A A A T^PT 


x VjuA 1 A 1 1 uA 


AAATCATATA 


CAAGCCGTTA 


7380 


AAATTTTATT 


GGATGACTTG 


ATT Ofi TTTPfi 


ATATT ATP A A 
AxAl inlvnn 


GGCTTATGAC 


GAGATTACAG


7440 


GTGTTGGACA 


TCGTGTTGTT 


GCTGGTGGAG 


AAT ATTTP A A 


AGAATCAACA 


GTTGTTGAGG 


7500 


GAGATGTTTT AGAAAAAGTT 


V2AAV1AU X X U A 


ul I 1 vj 1 lubL 


TCCTCTACAC 


AACCCGGCCA 


7560 


ATGCAGCAGG 


TGTTCGTGCC 


TTPA AfifIA AT 

X X CAnUUAA X 




CATTACCAGT 


GTAGTTGTTT 


7620 


TTGATACTTC 


CTTCCACACA 


AVJ 1 /1 1 Vj*— Ulu 


Af*A A jri/vnmik 


TCGCTACCCT 


CTACCAACAA 


7680 


AATATTACAC


AGAAAACAAG 


ui 1 V_\j x AAA 1 


f\\~yj3\j r\3\^ 1 {_ A 


TGGTACAAGT 


CACCAGTTTG 


7740 


TAGCAGGAGA 


AGCTGCAAAA 


1a. 1 luuuAC 


ffTV/"' a tut**/-" a 


AGACTTGAAG 


TTAATTACCT 


7800 


GTCATATTGG 


TAACGGAGGC 


1 Li\n 1 1 A \, /\\J 


LHj ivAAAuC 


CGGCAAATCT 


GTAGACACTT 


7860 


CTATGGGGTT 


CACTCCTCTT 


<JVJ 4 W X A X 1 A 




GCGTACAGGG 


GATATTGATC 


7920 


CAGCTATCAT 


TCCTTATTTA 


ATGCAATATA 


PAfiAfSfiATTT 


TAACACACCA 


GAAGATATCA 


7980 


GTCGTGTTCT TAACCGTGAA TCAGGTCTTT TGGGAGTTTC TGCTAATTCT AGCGATATGC 8040

GCGATATAGA AGCAGCTGTA GCAGAAGGGA ATCACGAGGC TAGCTTGGCT TATGAAATGT 8100

ATGTTGACCG TATCCAAAAA CATATCGGTC AGTACCTTGC AGTGCTAAAT GGAGCAGATG 8160

CCATTGTTTT CACAGCAGGT GTCGGTGAAA ATGCAGAGAG TTTCCGTCGT GATGTAATCT 8220

CAGGGATTTC GTGGTTTGGT TGTGATGTTG ATGATGAAAA GAATGTCTTT GGCGTTACAG 8280

GAGACATCTC AACAGAGGCA GCTAAAATCC GTGTCTTGGT TATTCCAACA GATGAAGAAT 8340

TAGTCATTGC CCGTGACGTT GAACGCTTGA AAAAATAAGT GAAACTAAAA AAATATTCAA 8400

TACAAGGAGT TGGGAAAGTT ATTTTTCCAG CTTCTTTTTC TGATGAAATT GTCCAAAACC 8460

TTGCTATGAT TGGCTTTTTT GAAAAATATG GTATAATAGT AGTAATTTAA TAGATGGAGT 8520

TGAGTTTTGA AGAAAAACTT TCGTGTAAAA AGAGAGAAAG ATTTTAAGGC GATTTTCAAG 8580

GAGGGGACAA GTTTTGCTAA TCGCAAATTT GTGGTCTACC AATTAGAAAA CCAGAAAAAC 8640

CGTTTTCGAG TAGGTCTATC AGTTAGCAAA AAACTGGGGA ATGCCGTCAC TAGAAATCAA 8700

ATTAAGCGAC GGATTCGGCA TATTATCCAG AATGCAAAAG GGAGTCTGGT AGAAGATGTC 8760

GACTTTGTTG TCATTGCTCG AAAAGGAGTC GAAACCTTGG GATACGCAGA GATGGAGAAA 8820


AATCTACTCC ATGTATTAAA ATTATCAAAG ATTTACCGGG AAGGAAATGG GAGTGAAAAA 8880

GAAACTAAAG TTGACTAGTT TGCTAGGACT GTCTCTGTTA ATCATGACAG CCTGTGCGAC 8940

TAATGGGGTA ACTAGCGATA TTACAGCCGA ATCGGCTGAT TTTTGGAGTA AATTGGTTTA 9000

CTTCTTTGCG GAAATCATTC GCTTTTTATC GTTTGATATT AGTATCGGAG TGGGGATTAT 9060

TCTCTTTACG GTCTTGATTC GTACAGTCCT CTTGCCAGTC TTTCAGGTGC AAATGGTGGC 9120

TTCTAGGAAA ATGCAGGAAG CTCAGCCACG CATTAAGGCG CTTCGAGAAC AATATCCAGG 9180

TCGAGATATG GAAAGCAGAA CCAAACTAGA GCAGGAAATG CGTAAAGTAT TTAAAGAAAT 9240

GGGTGTCAGA CAGTCAGACT CTCTTTGGCC GATTTTGATT CAGATGCCGG TTATTTTGGC 9300

CCTGTTCCAA GCCCTATCAA GAGTTGACTT TTTAAAGACA GGTCATTTCT TATGGATTAA 9360

CCTTGGTAGT GTGGATACAA CCCTTGTTCT TCCGATTTTA GCAGCAGTAT TCACCTTTTT 9420

AAGTACTTGG TTGTCCAACA AAGCTTTGTC TGAGCGAAAT GGCGCTACGA CTGCGATGAT 9480

GTATGGGATT CCAGTCTTGA TTTTTATCTT TGCAGTTTAT GCGCCAGGTG GAGTCGCCCT 9540

ATACTGGACA GTGTCTAATG CTTATCAAGT CTTGCAAACC TATTTCTTGA ATAATCCATT 9600

CAAGATTATC GCAGAGCGCG AGGCCGTAGT ACAGGCACAA AAAGATTTGG AAAATAGAAA 9660

AAGAAAAGCC AAGAAAAAGG CTCAGAAAAC GAAATAAATA AGGAGGAATC TGGTAGTGGT 9720

AGTATTTACA GGTTCAACTG TTGAAGAAGC AATCCAGAAA GGATTGAAAG AATTAGATAT 9780

TCCAAGAATG AAGGCTCATA TCAAAGTCAT TTCTAGGGAG AAAAAAGGCT TTCTTGGTCT 9840

ATTTGGTAAA AAACCAGCCC AAGTGGATAT TGAAGCGATT AGTGAAACGA CTGTTGTCAA 9900

AGCAAATCAA CAGGTAGTAA AAGGCGTTCC GAAAAAAATC AATGATTTGA ACGAGCCTGT 9960

GAAGACGGTT AGTGAAGAAA CCGTTGACCT TGGTCATGTG GTTGATGCTA TTAAAAAAAT 10020

AGAGGAAGAA GGTCAAGGTA TTTCTGATGA AGTCAAGGCT GAAATCTTAA AACATGAAAG 10080

ACATGCCAGC ACTATCTTAG AAGAAACTGG TCACATTGAG ATTTTAAATG AACTTCAAAT 10140

CGAGGAAGCG ATGAGGGAAG AAGCAGGCGC TGATGACCTT GAAACTGAGC AAGACCAAGC 10200

TGAAAGTCAA GAACTAGAAG ACTTGGGCTT GAAAGTTGAA ACGAACTTTG ATATTGAACA 10260

AGTAGCTACG GAAGTAATGG CTTATGTTCA AACGATTATT GATGACATGG ATGTTGAGGC 10320

TACACTTTCA AATGATTATA ACCGTCGTAG CATCAATCTA CAAATTGACA CCAACGAACC 10380

AGGTCGTATT ATCGGCTACC ATGGTAAAGT CTTGAAGGCC TTGCAACTGT TGGCTCAAAA 10440

TTATCTTTAC AACCGCTATT CCAGAACCTT CTACGTTACA ATCAATGTCA ATGATTATGT 10500

CGAACACCGT GCAGAAGTCT TGCAGACCTA TGCGCAAAAA TTGGCGACTC GTGTTTTGGA 10560

AGAAOGftrGr


r\\j i \_>i X AAAA 


r» ft ft ip/^/^ ft ft m 
l_ AuA 1 LLAA 1 


f*TTyf> ft ft ft fn ft f>f 
(j 1 LAAA i AGC 


GAACGCAAGA 


TTATCCATCG 


10620 


1MI Inll 1 l_>i 


1 A rfeUAi\> 


/>/v , mr' ft ^"vrt ft /"» 


TTACrTCTGAA 


GGTGATGAGC 


CAAATCGCTA 


10680 


i. vj 1 1 Vj 1 1 \j 1 J\ 


K3A 1 At. AGAAT 


ft ft /~».TI ft ft Tt * J IV » 

AAGTAAAATC 


AGGTTTATCC 


TGATTTTTTG 


CTAGTTAGAG 


10740 




1 ajA J vjIITJAA 


niftftf^ftfflft ft^ft 
TAAGATAAGA 


GACTATTTAG 


ACTTTGCTGG 


TTTGCAGTAC 


10800 




ft 11 ft ft Tir*tT*/>f> 


AGCAGAGCGA 


GAGAAGATGC 


TGGCATTCCG 


CCACAAAGGA 


10860


CAAuAliui.l.v. 


GAAAGGTTTT 


TACAGAACTG 


GCCAAAGCCT 


TTCAAGCAAG 


CCATCCAGAA 


10920 


fP/" , /" , /" , ft A / ry fT'/ m */* > 

I WjW\At. 1 


AACAGACTAG 


CCAGTGGATG 


AATCAGGCCC 


AGCGTTTGAG 


ACCAGATTTT 


10980 


1 VjVjLj 1I1A1L 


TACAGAGAGA 


CGGACAAGTG 


ACAGAACCTA 


TGATGGCCTT 


ACGTrTGTAT 


11040 




CTGACTTTGG 


AATTTCTTTG 


GAAGTCAGTT 


TCATCGAACG 


TAAGAAGGAT 


11100 




TGGGCAAGCA 


GGCCAAAGTT 


TTAGACATTC 


CAACCGTTAA 


AGGGATTTAT 


11160 


ip A TT^P ft ft 


AGTCTAATGG 


TCAAAGTCAA 


CGGTGGGAGG 


CGAATGAAGA 


AAAGCGTCGT 


11220 




ft ft ft p/vno » y» 
AGAAGGTGAG 


AAGTCAAGAA 


GTTCGAAAAG 


TTTTAGTGAA 


GGTAGATGTT 


11280 


1>L.1A1uA(JA\j 


AAAATTCGTC 


TGAAGAAGAA 


ATCGTAGAAG 


GCTTATTGAA 


GTCTTATTCT 


11340 


A ?i ft ft TMIVVpm^ 
AAAAI i J. C_ 


CCTATTATCT 


AGCTACGAGA 


AAATAAGATA 


ATTTGTAAAA 


CATCATAAAT 


11400 


L A I ALAG rCC 


AAGAGTGAAC 


AGTCCGCTGT 


GTAATTCTTG 


GTCTTTTTGT 


TTGCGCTTTC 


11460 


uVJA 1 1 A 1 ATA 


ATAAACTTAC 


AAAAACAATT 


CAAAAGGAGA 


ACAATTATGG 


AAGTCGTTTC 


11520 


rt/ilj 1 \j 11C 1 A 


AATTGGTTTT 


CTAGCAATAT 


TTTGCAGAAT 


CCCGCATTTT 


TCGTAGGTTT 


11580 


nl 1 Vj i J. V>3 


ft rn ft ft rn ft mr^ 


/^ft OflVTlfTIM IS ft 

CACTTTTGAA 


71 74 71 74 ftm/** 

AAAACCTGCC 


CATGACGTTT 


TTTCAGGGTT 


11640 




A*. AVj 1 nuuvj 1 


AI A i Vj i IvjL. 1 


TAACGTGGGT 


GCTGGTGGTT 


TGGTTACAAC 


11700 




1 Al3\.AVJ 




Plj ft 'VV^T" ft ft 


A "rGGTGCAG 


CGGTTATCGA 


11760 


CCCTTACTTT GGACTTGCTG CAGCAAACAA CAAAATTGTA GCAGAGTTTC CAGATTTTGT 11820

TGGAACTGCA ACTACAGCTC TATTGATTGG TTTTGGAATA AATATCTTGC TCGTAGCTCT 11880

TCGAAAGATT ACGAAGGTAA GAACCCTCTT TATTACTGGT CACATCATGG TACAACAAGC 11940

TGCAACAGTA TCTCTTATGG TTCTATTCTT AGTACCACAA TTGCGCAATG CTTACGGTAC 12000

AGCAGCGATT GGTATCATCT GTGGACTTTA CTGGGCAGTT AGTTCAAATA TGACTGTTGA 12060

GGCAACTCAA CGCTTGACTG GTGGTGGCGG ATTTGCGATT GGTCACCAAC AGCAATTTGC 12120

AATCTGGTTT GTAGATAAAG TAGCAGGACG CTTTGGTAAG AAAGAAGAAA GTTTAGACAA 12180

TCTTAAATTA CCTAAGTTCC TCTCAATCTT CCACGATACA GTTGTTGCAT CTGCTACCTT 12240

GATGCTCGTA TTCTTCGGAG CCATTCTTTT AATCTTGGGT CCAGACATTA TGTCTAATAA 12300

AGAAGTCATC ACTTCAGGAA CTCTATTCAA TCCTGCTAAA CAAGATTTCT TTATGTACAT 12360


TATCCAAACA GCCTTTACCT 


TCTCAGTTTA 


CTTGTTCGTT 


TTGATGCAAG 


GTGTCCGAAT 


12420 


GTTCGTATCT GAGTTGACAA 


ACGCCTTCCA 


AGGTATTTCA 


AACAAATTGT 


TGCCAGGTTC 


12480 


ATTCCCAGCG GTTGACGTTG 


CAGCTTCTTA 


TGGATTTGGT 


TCTCCAAATG 


CTGTCTTGTC 


12540 


AGGATTTACC TTTGGTTTGA 


TTGGTCAATT 


GATTACAATT 


GTTTTGCTCA 


TCGTCTTTAA 


12600 


AAATCCGATT CTTATTATTA 


CAGGATTTGT 


ACCAGTGTTC 


TTTGACAATG 


CAGCCATTGC 


12660 


GGTCTACGCT GATAAACGCG 


GCGGATGGAA 


AGCGGCTGTT 


ATCCTTTCCT 


TTATATCAGG 


12720 


TGTCCTTCAA GTTGCTCTAG 


GAGCTCTTTG 


TGTGGCCCTT 


CTCGATTTGG 


CATCTTATGG 


12780 


TGGCTACCAT GGAAATATCG 


ACTTTGAATT 


CCCATGGCTT 


GGATTTGGAT ATATCTTCAA 


12840 


ATACCTTGGT ATTGTTGGTT 


ATGTACTTGT 


GTGTCTCTTC TTGCTTGTTA TTCCTCAACT 


12900 


TCAATTTGCC AAAGCAAAAG 


ATAAAGAGAA 


ATATTACAAC 


GGTGAAGTTC 


AAGAAGAAGC 


12960 


TTAGTATCTA GAAAAGGAGA 


AATAAAATGG 


TTAAAGTATT 


AGCAGCGTGC 


GGAAATGGAA 


13020 


TGGGTTCATC AATGGTTATC AAGATGAAGG TTGAAAATGC TCTCCGTAAG CTTAATCAAA 


13080 


CAGATTTTAC AGTCAATTCA 


TGCAGTGTCG 


GTGAAGCTAA 


AGGTTTAGCA 


GTAGGATATG 


13140 


ACATCGTAAT CGCTTCTCTT 


CATTTGATTC 


AAGAATTGGA 


AGGGCGAACT 


AATGGGAAGT 


13200 


TAATTGGGCT TGATAACTTG 


ATGGATGATA 


AAGAAATCAC 


CGAAAAACTC 


AGTCAAGCAC 


13260 


TACAGTAAAA GGTTGGAGGG 


GGCTGGACAG 


AAACTGAGAG 


TTATCGTTTC 


TGTCCTTCTC 


13320 


CCTCTTTAAA TAAAGGAGGC 


AGATATGAAT 


TTAAAACAAG 


CTTTAATTGA 


CAATGACTCG 


13380 


ATCCGACTAG GTTTAGAGGC 


TAACAATTGG 


AAAGAAGCAG 


TCAAGGTAGC 


AGTAGATCCC 


13440 


TTAATTGAAA GTGGGGCAAT 


TTTGCCAGAG 


TATTACGATG 


CTATCATTGA 


ATCGACTGAA 


13500 


GAGTATGGGC CTTACTATAT 


CTTGATGCCA 


GGTATGGCTA 


TGCCCCACGC 


TAGACCTGAA 


13560 


GCAGGTGTGC AAAGTGATGC 


CTTTTCATTG 


ATTACCTTAC 


AAAATCCTGT 


TGTATTTTCA 


13620 


GATGGGAAAG AGGTATCTGT 


TTTGTTGGCA 


CTAGCAGCAA 


CAAGTTCAAA 


AATTCACACA 


13680 


AGTGTAGCCA TTCCACAAAT 


TATTGCCCTA 


TTTGAATTAG 


AAGATTCTAT 


TGCACGTTTA 


13740 


CAGGCTTGCC AGACTAAAGA AGATGTCTTG GCTATGATTG AAGAATCTAA GGATAGCCCT 


13800 


TATCTCGAAG GATTGGATTT 


GGAAAGTTAG 


AAAGAGGAAT 


AAAGAAATGA 


CAAAAAGAAT 


13860 


ACCTAATTTA CAAGTTGCAT TAGACCATTC 


AGACTTGCAA 


GGAGCGATTA 


AAGCAGCTGT 


13920 


TTCTGTTGGT CAGGAAGTAG 


ATATTATCGA 


AGCTGGAACT 


GTTTGCTTGC 


TTCAAGTTGG 


13980 


AAGTGAACTG GCTGAAGTCT 


TGCGTAGCCT 


TTTCCCAGAT 


AAGATTATTG 


TGGCAGACAC 


14040 


AAAATGTGCT GATGCTGGTG 


GAACAGTTGC 


TAAAAATAAT 


GCGGTTCGTG 


GAGCAGACTG 


14100 
GATGACTTGT ATCTGTTGTG CAACCATCCC TACTATGGAA GCAGCTCTAA AGGCTATCAA 14160

GACTGAACGA GGAGAACGAG GCGAAATCCA GATCGAGCTT TATGGCGATT GGACTTTTGA 14220

ACAAGCTCAG CTTTGGCTAG ATGCAGGTAT CTCACAAGCT ATTTATCACC AATCTCGTGA 14280

TGCTCTTCTT GCTGGTGAAA CTTGGGGTGA AAAAGACCTT AATAAGGTTA AAAAACTCAT 14340

TGACATGGGC TTCCGTGTAT CTGTAACAGG TGGTCTAGAT GTAGATACTC TCAAACTCTT 14400

TGAAGGTATT GATGTCTTTA CCTTTATCGC AGGTCGTGGA ATTACAGAGG CTGTGGATCC 14460

AGCAGGAGCA GCGCGTGCCT TCAAGGATGA AATCAAACGA ATTTGGGGGT AAATCATGGT 14520

ACGTCCAATT GGAATTTATG AAAAGGCAAC CCCAACACAC TGTACTTGGC TAGAACGTTT 14580

AAATTTTGCC AAGGAGTTAG GCTTTGATTT TGTCGAGATG TCTATTGACG AACGTGACGA 14640

GCGTTTAGCA AGACTTGACT GGAGTAAGGA AGAACGCTTG GAAGTTGTCA AAGCAATCTA 14700

TGAAACTGGT GTTCGTATTC CTTCTATCTG TTTTTCAGGC CATCGTCGCT ACCCATTGGR 14760

TTCAAAAGAT CCAGTTCTAG AGGAAAAATC TCTAGAACTC ATGAAAAAAT GTATCGAATT 14820

AGCTCAAGAC TTGGGAGTTC GTACGATTCA ATTAGCTGGT TACGATGTTT ACTATGAGGA 14880

AAAGTCACCC CAGACACGCC AACGTTTTAT CAAAAATTTG AGAAAAGCCT GTGACTGGGC 14940

TGAAGAAGCT CAGGTGGTAC TTGCTATTGA AATTATGGAT GATCCTTTCA TCAGTAGCAT 15000

CGAAAAATAT TTGGCTATAG AAAAAGAGAT TGACTCTCCC TTCCTCTTTG TATATCCAGA 15060

TATTGGTAAT GTGTCTGCAT GGCATAATGA TATCTATAGT GAGTTTTATC TTGGTCATCA 15120

TGCCATCGCA GCTCTCCATC TCAAGGATAC TTATGCAGTG ACAGAAAGTT CAAAGGGCCA 15180

GTTCCGAGAT GTACCTTTCG GGCAAGGTTG TGTCAAATGG GAAGAAGCTT TCGATATTTT 15240

AAAGGAAACC AATTATAATG GACCTTTCCT AATCGAAATG TGGTCTGAAA ATTGTGAAAC 15300

AGTAGAAGAA ACACGCGCAG CCATTCAAGA GGCGCAAGCT TTTCTCTATC CACTCATTAA 15360

GAAAGCAGGT TTGATGTAAG ATGAATCAAG TAATCAATGC TATGCGTAAA CGAGTCTGTG 15420

ATGCCAATCA ATCATTGCCA AAACATGGAC TTGTCAAATT TACCTGGGGG AATGTATCTG 15480

AAGTTAATCG CGAACTCGGT GTCATTGTTA TCAAACCATC AGGCGTGGAT TATGACGAAT 15540

TGACACCTGA AAACATGGTA GTGACTGATC TAGATGGTAA GATCCTAGAA GGGGATTTAA 15600

GACCATCTTC CGACCTCCCA ACTCATGTGC AATTATATAA GACTTGGTCA GAAATTGGTA 15660

GTGTGGTTCA CACCCATTCG ACAGAAGCTG TTGGTTGGGC TCAGGCAGGT CGTGATATTC 15720

CTTTCTACGG AACAACCCAT GCAGATTATT TCTACGGTTC AATCCCTTGC GCCCGTAGTT 15780

TGACCAAGGA CGAAGTAGAA GTGGCCTATG AAAAAGATAC TGGCCTGGTT ATCGTAGAAG 15840

AGTTTGAACA TCGCGGACTT AACCCGGTTG AAGTACCAGG AATTGTTGTA CGCAATCACG 15900

GTCCATTCAC CTGGGGCAAA AATCCAGAGA ATGCTGTTTA TCACTCTGTC GTACTAGAGG 15960

AAGTATCAAA GATGAATCGC TTTACAGAAC AAATCAATCC AAGAGTTGGA CCTGCTCCCC 16020

AGTACATACT AGAAAAACAC TACCAACGTA AACATGGACC AAATGCTTAT TATGGTCAAA 16080

AGTAAGAACG ATGAAGGAGG AGAAAAAGAT AAATTTAGCT CCTCTTTTTA CATTTGATTT 16140

TTATTGAGAG TAAAGTTGGA GTTGAAGTAA TTTTAAAAGA TTTTTTAGAA ATAGCGCTTG 16200

ATATATATAT GGTAAAATAA AAAGAATTGC TGTGATATCA ATAGATTTGG GGGATTTTTT 16260

AATATGGTAC TGGATAAGGC AAGTTGTGAT TTGCTTCAAT ATTTGATGGA TCAAGAAACG 16320

TCCAAAACGA TTATGGCGAT TTCGAAAGAT TTGAAAGAGT CAAGAAGGAA AATTTATTAT 16380

CACATTGACA AAATCAATGC TGCTCTGGGT GACGAGGCGC TTCACATCAT TAGTATTCCA 16440

CGAATTGGTA TTCACTTAAC GGAAGAGCAG AGAGATGCTT GTTGTAAACT ATTATCGGAA 16500

GTAGATTCGT ACGATTATAT CATGAGTGCG CATGAACGTA TGATGATAAT GTTACTATGG 16560

ATAGGTATTT CTAAAGAACG TATTACGATT GAAAAATTGA TAGAGTTAAC AGAGGTATCT 16620

AGGAATACTG TTCTCAATGA TTTGAATAGT ATTCGTTATC AACTAACTTT GGAACAATAT 16680

CAGGTGATCT TGCAAGTGAG CAAGTCACAG GGATACAACC TTCATGCCCA CCCTCTTAAT 16740

AAAATTCAGT ATCTTCAATC GCTTCTATAT CATATTTTTA TGGAAGAAAA TGCCACTTTT 16800

GTATCTATTT TAGAAGATAA GATGAAAGAG AGGTTAGATG ATGAGTGTTT GCTTTCTGTT 16860

GAAATGAACC AATTTTTTAA GGAACAGGTT CCTTTAGTTG AACAAGATTT AGGGAAGAAA 16920

ATAAACCATC ATGAAATAAC TTTTATGTTG CAGGTTCTAC CTTATTTGCT GTTAAGCTGT 16980

CATAATGTTG AACAGTATCA AGAAAGACAT CAGGATATAG AGAAAGAATT TTCTTTGATA 17040

AGAAAAAGAA TAGAGTATCA GGTGTCTAAG AAATTAGGAG AACGGTTGTT TCAAAAGTTT 17100

GAAATTTCTT TGTCAGGACT TGAAGTTTCT CTTGTAGCTG TTCTCCTCCT CTCCTATCGT 17160

AAAGATTTGG ATATTCATGC AGAAAGTGAT GATTTTCGGC AATTAAAACT TGCTTTAGAA 17220

GAATTTATCT GGTATTTTGA ATCACAAATC CGAATGGAGA TTGAGAACAA GGATGATTTG 17280

TTACGAAATT TGATGATCCA CTGTAAAGCC TTGTTATTTA GAAAGACTTA CGGTATTTTT 17340

TCTAAAAATC CTCTAACAAA ACAAATTCGA TCCAAGTATG GAGAATTATT TTTAGTCACT 17400

AGAAAATCTG CGGAAATTTT AGAAGGAGCA TGGTTTATTC GGCTAACAGA CGATGATATT 17460

GCCTATTTGA CGATTCATAT TGGAGGATTT TTAAAATATA CACCATCATC TCAAAAAAAT 17520

ATGAAAAAAG TTTATCTCGT TTGTGATGAA GGTGTTGCGG TTTCGAGACT TTTGCTGAAA 17580

CAATGCAAAC TTTATTTTCC AAATGAGCAA ATTGACACTG TATTTACAAC AGAACAATTT 17640

AAGAGTGTGG AAGATATTGC ACAAGTTGAT GTAGTGATTA CTACTAATGA TGATTTGGAT 17700

AGCAGATTTC CGATTTTAAG GGTTAATCCT ATCCTTGAAG CAGAAGATAT TTTGAAAATG 17760 

CTAGACTATC TTAAACACAA TATATTTCGT AATAAGAGCA AAAGTTTCAG TGAAAATCTT 17820 

TCTAGTCTTA TTTCGTCTTA TATTGTAGAC AGCAAGTTGG CTAGTAAGTT CCAAGAAGAG 17880 

GTTCAAACAC TTATAAATCA AGAAATAGTA GTTCAAGCTT TTTTGGAAGr TATTTGAAGG 17940 

ACAGTCCAAT GATGAACACA AACCTGTGTk TTTCsTGGTC TTTTtTAGTG TTTTGAAGGG 18000 

TGGkATACTA ATCTCAAAGA TAACAATTAT ATCCAAAGGA GGCAACATAT GCCAAACGTC 18060 

AAAGAAATTA CAAGAGAGTC ATGGATTTTA GCCACTTTCC CAGAGTGGpG AACATGGTTG 18120 

AACGAAGAAA TCGAAGAAGA AGTCGTACCT GAAGGCAACT TTGCCATGTG GTGGCTAGGC 18180 

AACTGTGGTA CTTGGATTAA GACACCAGCT GGTGCTAACG TTGTCATGGA CCTTTGGTCA 18240 

AACCGTGGAA AATCAACCAA AAAAGTGAAA GATATGGTTC GTGGGCACCA AATGGCAAAT 18300 

ATGGCAGGTG TTCGTAAGCT GCAACCAAAC TTGCGTGTTC AGCCAATGGT TATCGATCCA 18360 

TTTGCTATCA ACGAACTAGA CTATTACTTA GTTTCACACT TCCACAGTGA TCATATCGAC 18420 

CCATACACAG CTGCAGCAAT TCTCAATAAT CCTAAGTTAG AGCATGTTAA GTTGG 18475



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7186 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCAGGATTTG GTACCGTTGC AAGTGGTGTG CCTTTCCTCC TAAAGGAAAA TGGAGGAAAA 60

ATCAATCAAT CAGCACATTC AGATATCAAA GTTGCTAAGG TATTGGTCAA GGATGAAGAT 120 

GAAAAAAATC GCTTGCTTGC AGCAGGGAAT GACTTTAACT TTGTAACCAA TGTGGATGAT 180 

ATTTTATCAG ACCAGGATAT TACTATCGTA GTGGAATTGA TGGGGCGTAT TGAGCCTGCT 240 

AAAACCTTTA TCACTCGTGC CTTGGAAGCT GGAAAACACG TTGTTACTGC TAACAAGGAC 300

CTTTTAGCTG TCCATGGCGC AGAATTGCTA GAAATCGCTC AAGCTAACAA GGTAGCACTT 360 

TACTACGAAG CAGCAGTTGC TGGTGGGATT CCAATTCTTC GTACTTTAGC AAATTCCTTG 420 

GCTTCTGATA AAATTACGCG CGTGCTTGGA GTAGTCAACG GAACTTCCAA CTTCATGGTG 480 

ACCAAGATGG TGGAAGAAGG CTGGTCTTAC GATGATGCTC TTGCGGAAGC ACAACGTCTA 540

GGATTTGCAG AAAGCGATCC GACGAATGAC GTAGATGGGA TTGATGCAGC CTACAAGATG 600

GTTATTTTGA GCCAATTTGC CTTTGGCATG AAGATTGCCT TTGATGATGT AGCCCACAAG 660

GGAATCCGCA ATATCACACC AGAAGACGTA GCTGTAGCTC AAGAGCTTGG TTACGTAGTG 720

AAATTGGTTG GTTCTATTGA GGAAACTTCT TCAGGTATTG CTGCAGAAGT GACTCCAACC 780

TTCCTACCTA AAGCGCACCC ACTTGCTAGT GTGAATGGCG TAATGAACGC TGTCTTTGTA 840

GAATCTATCG GTATTGGTGA GTCTATGTAC TACGGACCAG GTGCGGGTCA AAAACCAACT 900

GCAACAAGTG TTGTAGCTGA TATTGTCCGT ATCGTTCGTC GTTTGAATGA TGGTACTATT 960

GGCAAAGACT TCAACGAATA TAGCCGTGAC TTGGTCTTGG CAAATCCTGA AGATGTCAAA 1020

GCAAACTACT ATTTCTCAAT CTTGGCTCTA GACTCAAAAG GTCAGGTCTT GAAGTTGGCT 1080

GAAATCTTCA ATGCTCAAGA TATTTCCTTT AAGCAAATCC TTCAAGATGG CAAAGAGGGT 1140

GACAAGGCGC GTGTCGTTAT CATCACACAC AAGATTAATA AAGCCCAGCT TGAAAATGTC 1200

TCAGCTGAAT TGAAGAAGGT TTCAGAATTC GACCTCTTGA ATACCTTCAA GGTGCTAGGA 1260

GAATAAGATG AAGATTATTG TACCTGCAAC CAGTGCCAAT ATCGGGCCAG GTTTTGACTC 1320

GGTCGGTGTA GCTGTAACCA AGTATCTTCA AATTGAGGTC TGCGAAGAAC GAGATGAGTG 1380

GCTGATTGAA CACCAGATTG GCAAATGGAT TCCACATGAC GAGCGTAATC TCTTGCTCAA 1440

AATCGCTTTG CAAATTGTAC CAGACTTGCA ACCAAGACGC TTGAAAATGA CCAGTGATGT 1500

CCCTTTGGCG CGCGGTTTGG GTTCTTCCAG CTCGGTTATC GTTGCTGGGA TTGAACTAGC 1560

CAACCAACTG GGTCAACTCA ACTTATCAGA CCATGAAAAA TTGCAGTTAG CGACCAAGAT 1620

TGAAGGGCAT CCTGACAATG TGGCTCCAGC CATTTATGGT AATCTCGTTA TTGCAAGTTC 1680

TGTTGAAGGG CAAGTCTCTG CTATCGTAGC AGACTTTCCA GAGTGTGATT TTCTAGCTTA 1740

CATTCCAAAC TATGAATTAC GTACTCGCGA CAGCCGTAGT GTCTTGCCTA AAAAATTGTC 1800

TTATAAGGAA GCTGTTGCTG CAAGTTCTAT CGCCAATGTA GCGGTTGCTG CCTTGTTGGC 1860

AGGAGACATG GTGACCGCTG GGCAAGCAAT CGAGGGAGAC CTCTTCCATG AGCGCTATCG 1920

TCAGGACTTG GTAAGAGAAT TTGCGATGAT TAAGCAAGTG ACCAAAGAAA ATGGGGCCTA 1980

TGCAACCTAC CTTTCTGGTG CTGGGCCGAC AGTTATGGTT CTGGCTTCTC ATGACAAGAT 2040

GCCAACAATT AAGGCAGAAT TGGAAAAGCA ACCTTTCAAA GGAAAACTGC ATGACTTGAG 2100

AGTTGATACC CAAGGTGTCC GTGTAGAAGC AAAATAAAGA ATAGAAGATA GGATGGGGAA 2160

ACTCTTGACC AGAGGGGTTC ATATCCTTTT TGTGAAAAGA AGTTTATACT CAATGAAAAT 2220

CAAAGAGCAA ACTAGGAAGC TAGCCGCAGG CTGCTCAAAA CAGTGTTTTG AGGTTGCAGA 2280

TAGAACTGAC GAAGTCAGCT CAAGACACTG TTTTGAGGTT GCAGATAGAA CTGACGAAGT 2340

CAGTAACCAT ACTACGGTAA GGTGACGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT 2400

TAGTTAAAAA CGTGATAAAG GAGAAATAAA GATGGCAGAA ATTTATCTAG CAGGTCGTTG 2460 

TTTTTGGGGC CTAGAGGAAT ATTTTTCACG CATTTCTGGA GTGCTAGAAA CCAGTGTTGG 2520 

CTACGCTAA? GGTCAAGTCG AAACGACCAA TTACCAGTTG CTCAAGGAAA CAGACCATGC 2580 

AGAAACGGTC CAAGTGATTT ACGATGAGAA GGAAGTGTCA CTCAGAGAGA TTTTACTTTA 2640 

TTATTTCCGA GTTATCGATC CTCTATCTAT CAATCAACAA GGGAATGACC GTGGTCGCCA 2700 

ATATCGAACT GGGATTTATT ATCAGGATGA AGCAGATTTG CCAGCTATCT ACACAGTGGT 2760 

GCAGGAGCAG GAACGCATGC TGGGTCGAAA GATTGCAGTA GAAGTGGAGC AATTACGCCA 2820 

CTACATTCTG GCTGAAGACT ACCACCAAGA CTATCTCAGG AAGAATCCTT CAGGTTACTG 2880 

TCATATCGAT GTGACCGATG CTGATAAGCC ATTGATTGAT GCAGCAAACT ATGAAAAGCC 2940 

TAGTCAAGAG GTGTTGAAGG CCAGTCTATC TGAAGAGTCT TATCGTGTCA CACAAGAAGC 3000 

TGCTACAGAG GCTCCATTTA CCAATGCCTA TGACCAAACC TTTGAAGAGG GGATTTATGT 3060 

AGATATTACG ACAGGTGAGC CACTCTTTTT TGCCAAGGAT AAGTTTGCTT CAGGTTGTGG 3120 

TTGGCCAAGT TTTAGCCGTC CGATTTCCAA AGAGTTGATT CATTATTACA AGGATCTGAG 3180 

CCATGGAATG GAGCGAATTG AAGTTCGTTC TCGTTCAGGC AGTGCTCACT TGGGTCATGT 3240 

TTTCACAGAT GGACCGCGGG AGTTAGGCGG CCTCCGTTAC TGTATCAATT CTGCTTCTTT 3300 

ACGCTTTGTG GCCAAGGATG AGATGGAAAA AGCAGGATAT GGCTATCTAT TGCCTTACTT 3360 

AAACAAATAA AACAGAGAGT GGGGCTTCCC ACTTTCTTCA TTTCTAGAAT ATGAATAGAA 3420 

GGGATTTATG AAACACCTAT TATCTTACTT CAAACCCTAC ATCAAGGAAT CAATTTTAGC 3480 

CCCCTTGTTC AAGCTGTTAG AAGCTGTTTT TGAGCTCTTG GTTCCCATGG TGATTGCTGG 3540 

GATTGTTGAC CAATCTTTAC CTCAGGGAGA TCAAGGTCAT CTCTGGATGC AGATTGGCCT 3600 

GCTCCTTATC TTTGCAGTAA TTGGCGTTTT AGTGGCCTTG ATAGCTCAAT TTTACTCAGC 3660 

AAAGGCAGCA GTAGGTTCTG CTAAGGAATT GACAAACGAT CTTTATCGTC ATATTCTTTC 3720 

CTTGCCCAAG GACAGCAGAG ACCGTCTGAC AACTTCTAGT TTGGTCACTC GCTTGACTTC 3780 

GGATACCTAC CAGATTCAGA CTGGTATCAA TCAATTCCTG CGTCTCTTTT TACGAGCGCC 3840 

CATTATCGTT TTTGGTGCCA TTTTTATGGC TTATCGAATC TCAGCTGAGT TGACTTTCTG 3900 

GTTCTTAGTC TTGGTTGCCA TTTTGACCAT TGTCATTGTA GGGTTATCTC GATTGGTCAA 3960 

TCCTTTCTAC AGTAGTCTCA GAAAGAAAAC GGACCAACTG GTTCAGGAAA CGCGCCAGCA 4020 

ATTGCAAGGG ATGCGGGTTA TTCGTGCTTT TGGTCAAGAA AAACGAGAGT TACAGATTTT 4080

TCAAACCCTT AACCAAGTTT ATGCTAGATT ACAAGAAAAG ACAGGTTTCT GGTCTAGTTT 4140

ATTAACACCT CTGACCTATC TGATTGTCAA TGGAACTCTT CTCGTTATTA TCTGGCAAGG 4200 

CTATATTTCA ATTCAAGGAG GAGTGCTCAG TCAAGGTGCT CTCATTGCTC TTATCAATTA 4260 

CCTCTTACAG ATTTTGGTGG AATTGGTCAA GCTAGCCATG TTGATCAATT CCCTCAACCA 4320 

GTCCTATATC TCAGTCAAGC GAATCGAGGA AGTCTTTGTT GAGGCTCCAG AGGATATCCA 4380 

TTCAGAGTTA GAACAAAAGC AAGCTACCAG AGATAAGGTT TTACAAGTCC AAGAATTGAC 4440 

CTTTACCTAT CCTGATGCGG CCCAGCCTTC TCTGAGATAC ATTTCCTTTG ATATGACTCA 4500 

AGGACAAATT CTAGGTATCA TCGGGGGAAC TGGTTCTGGT AAATCAAGCT TGGTGCAACT 4560 

CTTACTTGGA CTTTATCCAG TAGACAAGGG GAACATTGAC CTTTATCAAA ATGGACGTAG 4620 

TCCTCTTAAT TTGGAGCAGT GGCGGTCTTG GATTGCCTAT GTACCTCAAA AGGTCGAACT 4680 

CTTTAAAGGA ACCATTCGTT CCAACTTGAC TCTAGGTTTC AATCAAGAAG TATCTGACCA 4740 

GGAACTCTGG CAGGCCTTGG AGATTGCGCA AGCTAAGGAT TTTGTCAGTG AAAAGGAAGG 4800 

ACTCTTGGAT GCTCTAGTTG AGGCAGGGGG GCGAAATTTC TCAGGTGGAC AAAAACAAAG 4860 

ATTGTCTATC GCCCGAGCAG TCTTGCGCCA GGCTCCGTTT CTCATCCTAG ATGATGCAAC 4920 

CTCGGCACTG GATACCATTA CAGAGTCCAA GCTCTTGAAA GCTATTAGAG AAAATTTTCC 4980 

AAACACGAGC TTAATTTTGA TCTCTCAACG AACCTCAACT TTACAGATGG CGGACCAGAT 5040 

TCTCCTCTTG GAAAAAGGTG AGTTGCTAGC TGTTGGCAAG CACGATGACT TGATGAAATC 5100 

CAGCCAAGTC TATTGTGAAA TCAATGCATC CCAACATGGA AAGGAGGACT AGAATGAAAC 5160 

GACAAACTGT AAACCAGACG CTCAAACGTT TAGCCGTAGA TTTAGCAAGC CATCCTTTCC 5220 

TCCTTTTCCT AGCCTTTCTA GGAACTATTG CCCAAGTTGG CTTATCAATT TACCTACCTA 5280 

TTCTGATTGG GCAGGTCATT GACCAAGTCC TAGTGGCTGG TTCATCACCA GTTTTTTGGC 5340 

AGATTTTTCT CCAGATGCTC TTGGTGGTAA TAGGAAATAC TCTGGTACAA TGGGCCAATC 5400 

CTCTCCTCTA TAATCGTCTA ATCTTCTCTT ATACCAGAGA TTTACGGGAG CGAATCATCC 5460 

ATAAGCTCCA TCGTTTACCG ATTGCCTTTG TAGATAGGCA AGGTAGTGGA GAGATGGTTA 5520 

GTCGTGTAAC CACGGACATC GAACAGTTGG CAGCTGGCTT GACCATGATT TTTAACCAAT 5580 

TTTTCATTGG TGTTTTGATG ATTTTGGTCA GTATTCTAGC CATGCTCCAA ATTCATCTCC 5640 

TCATGACTCT CTTAGTCTTG CTGTTGACGC CACTGTCCAT GGTGATTTCA CGCTTTATTG 5700

CCAAGAAATC CTATCATCTC TTCCAGAAGC AAACAGAGAC GAGGGGAATT CAGACTCAGT 5760 

TGATTGAAGA ATCGCTTAGT CAGCAGACTA TAATCCAGTC CTTCAATGCT CAAACAGAAT 5820 

TTATCCAAAG ATTGCGTGAG GCTCATGACA ACTACTCAGG CTATTCTCAG TCAGCCATCT 5880

TTTATTCTTC AACGGTCAAT CCTTCGACTC GCTTTGTAAA TGCACTCATT TATGCCCTTT 5940

TAGCTGGAGT AGGAGCTTAT CGTATCATGA TGGGTTCAGC CTTGACCCTC GGTCGTTTAG 6000 

TGACTTTTTT GAACTATGTT CAGCAATACA CCAAGCCCTT TAACGATATT TCTTCAGTGC 6060 

TAGCTGAGTT GCAAAGTGCT CTGGCTTGCG TAGAGCGTAT CTATGGAGTC TTAGATAGCC 6120 

CTGAAGTGGC TGAAACAGGT AAGGAAGTCT TGACGACCAG TGACCAAGTT AAGGGAGCTA 6180 

TTTCCTTTAA ACATGTCTCT TTTGGCTACC ATCCTGAAAA AATTTTGATT AAGGACTTGT 6240 

CTATCGATAT TCCAGCTGGT AGTAAGGTAG CCATCGTTGG TCCGACAGGT GCTGGAAAAT 6300 

CAACTCTTAT CAATCTCCTT ATGCGTTTTT ATCCCATTAG CTCGGGAGAT ATCTTGCTGG 6360 

ATGGGCAATC CATTTATGAT TATACACGAG TATCATTGAG ACAGCAGTTT GGTATGGTGC 6420 

TTCAAGAAAC CTGGCTCACA CAAGGGACCA TTCATGATAA TATTGCCTTT GGCAATCCTG 6480 

AAGCCAGTCG AGAGCAAGTA ATTGCTGCTG CCAAAGCAGC TAATGCAGAC TTTTTCATCC 6540 

AACAGTTGCC ACAGGGATAC GATACCAAGT TGGAAAATGC TGGAGAATCT CTCTCTGTCG 6600 

GCCAAGCTCA GCTCTTGACC ATAGCCCGAG TCTTTCTGGC TATTCCAAAG ATTCTTATCT 6660 

TAGACGAGGC AACTTCTTCC ATTGATACAC GGACAGAAGT GCTGGTACAG GATGCCTTTG 6720 

CAAAACTCAT GAAGGGCCGC ACAAGTTTCA TCATTGCTCA CCGTTTGTCA ACCATTCAGG 6780 

ATGCGGATTT AATTCTTGTC TTAGTAGATG GTGATATTGT TGAATATGGT AACCATCAAG 6840 

AACTCATGGA TAGAAAGGGT AAGTATTACC AAATGCAAAA AGCTGCGGCT TTTAGTTCTG 6900 

AATAAGCCAT TCTCTTTTGA AAGTTTATGG ACGAAAAAAG TTGCCTTCGA GTGACTTTTT 6960 

TGTTACAATA GCTAGAAAAA TTGTTCACTG TAATACTCAA TGAAAATCAA AGAGCAAACT 7020 

AGGAAGCTAG CCGTAGGTTG CTCAAAGCAC AGCTTTGAGG TTGTAGATAA GACTGACGAA 7080 

GTCAGTTCAA AACACTGTTT TGAGGTTGCA GATAGAACTG ACGAAGTCAG CTCAAAACAC 7140 

TGTTTTGAGG TTGCAGATAG AACTGACGAA GTCAGCTCAA AACAGG 7186

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

CTGAAAATTC TAAAAAATTT ATAAGTAAGG AATTAATTAG TTATTTTTGT GATAAAGTTT 60

ATGATGAAAT ATTTGTTGAA GAGGTAGTTC CGCACGTTTT TCTGCCATAT GAATCTGACT 120

TACTTCTTAT TTTACCAGCT ACGGCAAATG TGATTGGCAA AATTGCTAAT GGTATTGCTG 180 

ATGATTTAGT TACAGCAACT GTTTTAAACT TTAATAAAAA AATAATTTTT TGTCCCAATA 240 

TGAACTCTAC TATGTGGGAC AATCACATAG TTCAAAGAAA TGTATCAATT CTAAAGGAGT 300 

TGGGACATAT ATTTTTATTT GAGTCTAAAA AAACATATGA GGTAGGATTG CGTAAAGCAA 360 

TAGATTCAAC ATGTTCAATG TTACAACCAC AGTCGTTAGT AAAAGAACTT ATCAAATTAG 420 

AAAATATTGT CCTTGAAGAG GGACATTAAA AACTACTGAG AATATTAATG AGGGGAAAAA 480 

ATGGAAAATT CATCAATCGA TGTAGATATG CTGTTGGAAG AATTGACACA AGAAGCAATG 540 

GTCGTTGTTG CTGTTGATAA GGACTGTTAA TTTAAACTTA TGGCAATATA TGAAAGGTTA 600 

CTGGATGTTT TAAATTATGC AGGCAGTAGC CTTTTATTAT ATACAAATGG ATAAAGTAAG 660 

GATAATACAA TGATTAATAA AAAAATACAA CAAGTTGTTT TGGAATCATT ACAGAATTTT 720 

TTGAATGGGA ACTTCATTTC GCCTTGTGTA GTCTATGATT TTGGCTTGCT GGAAACTGTA 780 

CTTGATGAAT TTAAAAATCA AATTCCTGTA ACATTCAATT ACCAACTTTT TTATGCCGTT 840 

AAAGCAAATT CAAATGAGAA GATACTTGAA TTCTTAGTAG ATAAAATTGA TGGAGTTGAT 900 

GTGGCGTCAT TATCTGAATT AGATGTGGCT AAAAAATTTT TCCCACCAAC TCAAATTTCT 960 

GTTAATGGTC CCGCATTTTC TTATGAAACT TTATATAATC TGATTAAAAA ACAATATAAA 1020 

GTTGATATTA ACTTTTTGGA ACATCTTCAA CAATTTTCCC CAAAAGAATC TGTTGGAATA 1080 

AGAGTAACGG AGCCAGATGA ACTTAATAAT CGTATGAGTC GATTTGGAAT AAATATTTGC 1140 

AGTGATAATT GGACTAGTAA TTTACAAAAT CCTTTAATTA CACGACTGCA TTTTCATTTT 1200 

GGAGAAAAAG ATGATAAATT TATTGTTAAG TTAGATAAAA TATTATTTAA GTTACAAGAA 1260

ATTAATAAAC TTAGAGAGGT TAGAGAAATA AATCTTGGAG GCGGTTTTAT GAAATTATTT 1320 

ATGGAAAATC GTTTGAAAGA ATTTTTTCTA TCACTTATGG AAATCTATAA AAAGTACGAT 1380 

ATTGATAGTA CTGTGACTAC AATAATAGAA CCAGGTAGTG CAATTACTTC ATTTTCTGCC 1440 

TATATGATTA CTAGCCCAGT TAATGTTAGT GAGGTGAATG AGCAGCAGGT TATCACGTTA 1500 

GACACATCAA TATACACCAA TACATTATGG TTTGTTCCGC ATATTATTAC AACGTTAAAT 1560 

TCAAGTAGTA AAGAGCGTTA TAGTACTATT CTCTATGGTA ATACCTGTTA TGAACATGAC 1620 

AAGTATAAAA TGAAAGTTTC GCTTCCAAGG TTAACTCAAA ATAGCAGTAT AGTGTTTTTT 1680 

CCTGTAGGAG CTTATATAAA AAGCAATCAT TCAAATTTAC ATCGTAATGA TTTTATGCGG 1740 

GAGGTATATT TGTGGACAAA AAACTTGACA TATTAGATAA AGTTAAGGAA TATTTAGGAA 1800 

ATAAAACTAC TCAAATTCTG GATAATCAAT ATAAAGAATT TTTGAAACTT AATGATATAA 1860

GGCGAGCGTT TGGTATTTCA GAAAAAGTAT TAAACAATTC TTTTAATTTT ACGAGTAAAG 1920

AATTTAATGA TTTAATTAAT AACGAAAATT ATTTATTCGA ATATGCATGT AGAATTAGAG 1980

AGGAATGGAG AAAAAAATGC TTTAATCATT CTTATCGTTT TCTATGCTCA CCTATAATTA 2040 

CAGATGATTT TCTTAACACG AAGACATTGA GAAGTAGCCA AATTGAATAT AAATATGAGC 2100 

GATATTTATC GAAAAGTTCG ATAGGCGATA GAGCGGTTGA TGGCTTTGTT TCCTTCAATA 2160 

CTTTAACAGC TAATGGTATG TCTGCTATTA AACTATGTCT TGAGATATTA AACTCTATTT 2220 

TCTTCAAGAA GAAGATTGAT TTATTATATT CAACCGGATA TTATGAAACA AGATTTTTAT 2280 

TAAATAATCT TGCTAAATCA GGTATTAGTT GCTATGAGGT AAGTAATTGT GAATTGGATA 2340 

AAGATAAATT TTATAATGTA TTCATGATGG AACCCAATCG AGCCGATTTA ACATTACAAA 2400 

AAACTGATTT CAAGATAGTA GAATATTTTG TTAAGTATAA AAATAATTCA ATAAAAGTCG 2460 

TTATTTTAGA TATTTCATAT CAAGGTTCTA ATTTTAAATT AGTAGAATTT TTAGAGAAAT 2520 

TTAAATTTGC GAATGTAATT ATTTTTGTGG TACGATCTTT GATAAAATTA GATCAAATGG 2580 

GATTAGAATT GACAAATGGG GGAATAATAG AAGTGTTTAT TCCTAATCAT TTGAGAAAGT 2640 

TGAAAAATTT TATTGAAGAG GAATTCAATA AATTTAGAAA TTCTCACGGA CCTAATCTAA 2700 

GCCTCTATGA ATACTGTTTG CTTGATAATT CTTTAACTTT AAAAAATGAT TGGAACTATT 2760 

CTGATTTAGT TATGAAATTT ACGAGTAATT TTTATGCTGA TATAAAAGAC TTGTTCATGG 2820 

AAAATTCTGA TATTGAAATC ATCCATGAAG AGGGAGTACC TTTTGTATTT TTAGATTTAA 2880 

TAGGTGAAGG TAAAAAAGAA TATGAAATGT TTTTTCAATG GTTAAACTTC TTTTACAAAC 2940 

AGCTTGGAAT CACATTGTAT GCTAGAAATA GTTTTGGGTT TCGGAATCTA ACAGTAGAGT 3000 

ATTTTGGAAT TATTGGGACA GAAAGATATA TATTTAAGAT TTGTCCAGGT GTTTATAAAG 3060 

GGTTAAGTTA TTATTTGATG AAATTTTTAT TAAAATCTTT TTCAAATGAA TATTTAAAAA 3120 

CTACTGATGA GGTTAATAGA TGAAAAATTT GATAAAGTTG CTAATAATTA GATTGATTGT 3180 

TAACTTAGCA GACAGTGTAT TTTATATAGT AGCATTGTGG CACGTTAGCA ATAATTATTC 3240 

TTCGAGCATG TTCTTAGGAA TATTTATTGC AGTAAATTAT CTACCGGATT TGTTACTAAT 3300 

CTTTTTTGGA CCAGTTATTG ACAGAGTAAA TCCGCAAAAA ATTCTTATAA TATCAATTTT 3360 

GGTTCAATTA GCAGTGGCTG TAATATTTTT ATTATTATTA AACCAAATAT CATTTTGGGT 3420

GATAATGAGT CTAGTGTTTA TTTCAGTAAT GGCTAGCTCC ATAAGTTACG TGATAGAAGA 3480 

TGTGTTGATT CCTCAAGTGG TAGAATATGA TAAGATTGTA TTTGCAAATT CTCTTTTTAG 3540 

TATTTCGTAT AAAGTATTAG ATTCTATTTT TAATTCATTC GCATCATTTT TACAGGTGGC 3600

AGTAGGATTT ATTTTATTGG TTAAGATAGA TATAGGCATA TTTTTACTTG CTCTATTTAT 3660

ATTGTTGTTG TTAAAATTTA GAACTAGCAA TGCGAATATA GAAAACTTCT CTTTCAAATA 3720 

TTACAAGAGA GAAGTGTTGC AAGGTACAAA GTTTATTTTA AATAATAAAT TATTATTTAA 3780 

AACCAGTATT TCTTTAACGC TTATAAACTT TTTTTATTCA TTTCAGACAG TAGTTGTACC 3840 

GATTTTTTCT ATTCGATATT TTGATGGTCC GATTTTTTAT GGTATTTTTT TAACTATTGC 3900 

TGGTTTGGGT GGTATATTGG GAAATATGCT AGCGCCAATC GTAATAAAAT ATTTAAAATC 3960 

GAATCAAATT GTTGGTGTAT TTCTTTTTTT GAACGGCTCA AGTTGGTTAG TAGCAATTGT 4020 

TATAAAAGAC TATACTTTAT CACTTATTTT ATTTTTCGTT TGTTTTATGT CTAAAGGAGT 4080 

CTTCAATATT ATTTTTAATT CGTTGTACCA ACAAATACCT CCACATCAAC TTCTTGGTAG 4140 

GGTAAATACT ACCATTGATT CTATTATTTC TTTTGGAATG CCAATTGGTA GTTTAGTTGC 4200 

AGGAACGCTT ATTGATTTGA ATATTGAATT AGTGTTAATT GCTATTAGCA TACCTTATTT 4260 

TTTGTTTTCT TATATTTTTT ATACGGATAA TGGATTGAAA GAATTTAGTA TATATTAGAA 4320 

ATGTTTATGT TCATTCAAAA GCATAATGAC TATAACTGAA AAAGAAAAGT GATATCTTTA 4380 

AGGTTGTTCT TCTTGGTGGT GAGATTCGTG AGACAACCCA AGCTTTTGTC GGAAAGATTA 4440 

CCAATGCTTT GATGGATAGG ATGTACTTTA GCAAGATGTT TTTAGTGGTA ACGGTATCGT 4500 

GGATGGACGT GTAATAACCT CTTCTTTCGA GGAGTATTTT ACTAAAAAAC TAGCCTTGGA 4560 

GCGTTCCCCA GAAACGGACT TACTCATTGA CTCTTCAAAG ATTTGGGGAG AAGATTTTGC 4620 

TTCATCTGTT CCTTGAAAAA AGTCACAGCA GTCATCACAG ACGATAGTAC TGAACAAAAC 4680

TATGAAGAGT TAGAAATTTA TACGCAGGTG ATTGTATAAA GGATCTGGAA ATAGATAAGA 4740 

AGTTGATTAG TATTGACCTA GGTGGTACAA ATATTAAGAT TACTGTTCTT TCAAATGACG 4800 

GTGAGATTGA AACTTTGTGG AGTATTACAA CAGATACAAG TGAGAAAGGT TCTCAAATTA 4860 

TATCGGACAT CATCAGTTCT ATTAAAAATA AATTGACCGA ACGGAATATT CCTGATAGCG 4920

ACCTTCTTGG AATCGGTATG GGAAGTTGCT CATCATACTT TCCTTGTAAA TCATAGGGGC 4980

TATAAACTCT CCGTCTACTT GTCCTGCAAC AATTGAAGTC TGCTCAAAAC GCCGTCCGCT 5040

AATCTTTTCA TAGACTTTCT CCCTTTTAGG AGCCTAGCTT TCTAGTTTGT TCTTTGATTT 5100 

TTATTGAGTA TACCACTATT TTACTCCCTC TGGCAAGGGA CTTTGTCTAT GTGGAGGGAT 5160 

TGGGCTCCTA TGTGGTGGAG CTTTTCTGTT CTTTCTGAAA TATGGTATAA TAGCACTAAT 5220 

CAATTTCTAG GAAAATAGAT ACAGAAAGGG GCTGAAAGAT GTCTCATATT ATTGAATTGC 5280 

CAGAGATGCT GGCAAACCAA ATCGCGGCTG GAGAGGTCAT TGAACGTCCT GCCAGTGTGG 5340 

TCAAAGAGTT GGTAGAAAAT GCCATTGACG CGGGCTCTAG TCAGATTATC ATTGAGATTG 5400

AGGAAGCTGG TCTCAAGAAG GTTCAAATCA CGGATAACGG TCATGGAATT GCCCACGATG 5460

AGGTGGAGTT GGCCCTGCGT CGCCATGCGA CCAGTAAGAT AAAAAATCAA GCAGATCTCT 5520 

TTCGGATTCG GACGCTTGGT TTTCGTGGTG AAGCCTTGCC TTCTATTGCG TCTGTTAGTG 5580 

TCTTGACTCT GTTAACGGCG GTGGATGGTG CTAGTCATGG AACCAAGTTA GTCGCGCGTG 5640 

GGGGTGAAGT TGAGGAAGTC ATCCCAGCGA CTAGTCCTGT GGGAACCAAG GTTTGTGTGG 5700 

AGGATCTCTT TTTCAACACG CCTGCCCGTC TCAAGTATAT GAAGAGCCAG CAAGCGGAGT 5760 

TGTCTCATAT CATTGATATT GTCAACCGTC TGGGCTTGGC CCATCCTGAG ATTTCTTTTA 5820 

GCTTGATTAG TGATGGCAAG GAAATGACGC GGACAGCAGG GACTGGTCAA TTGCGCCAAG 5880 

CAATCGCAGG GATTTACGGT TTGGTCAGTG CCAAGAAGAT GATTGAAATT GAGAACTCTG 5940 

ACCTAGATTT CGAAATTTCA GGTTTTGTGT CCTTGCCTGA GTTGACTCGG GCTAACCGCA 6000 

ATTATATCAG CCTCTTCATC AATGGCCGTT ATATTAAGAA CTTCCTGCTC AATCGTGCTA 6060

TTTTGGATGG TTTTGGAAGC AAGCTTATGG TTGGACGTTT TCCACTGCCT GTCATTCACA 6120 

TCCATATCGA CCCTTATCTA GCGGATGTCA ATGTGCATCC AACTAAGCAA GAGGTGCGGA 6180 

TTTCCAAGGA AAAAGAACTG ATGACTCTGG TTTCAGAAGC TATTGCAAAT AGTCTCAAGG 6240 

AACAAACCTT GATTCCAGAT GCCTTGGAAA ATCTTGCCAA ATCGACCGTG CGCAATCGTG 6300 

AGAAGGTGGA GCAAACTATT CTCCCACTCA AAGAAAATAC GCTCTACTAT GAGAAAACTG 6360 

AGCCGTCAAG ACCTAGTCAA ACTGAAGTAG CTGATTATCA GGTAGAATTG ACTGATGAAG 6420 

GGCAGGATTT GACCCTGTTT GCCAAGGAAA CCTTGGACCG ATTGACCAAG CCAGCAAAAC 6480 

TGCATTTTGC AGAGAGAAAG CCTGCTAACT ACGACCAGCT AGACCATCCA GAGTTAGATC 6540 

TTGCTAGCAT CGATAAGGCT TATGACAAAC TGGAGCGAGA AGAAGCATCC AGCTTCCCAG 6600 

AGTTGGAGTT TTTCGGACAA ATGCACGGGA CTTATCTCTT TGCCCAAGGG CGAGATGGAC 6660 

TTTACATCAT AGATCAGCAC GCTGCTCAGG AACGGGTCAA GTACGAGGAG TACCGTGAAA 6720 

GCATTGGCAA TGTTGACCAA AGCCAGCAGC AACTCCTAGT GCCCTATATC TTTGAATTTC 6780 

CTGCGGATGA TGCCCTGCGT CTCAAGGAAA GAATGCCTCT CTTAGAGGAA GTGGGCGTCT 6840 

TTCTAGCAGA GTACGGAGAA AATCAATTTA TTCTACGTGA ACATCCTATT TGGATGGCAG 6900 

AAGAAGAGAT TGAATCAGGC ATCTATGAGA TGTGCGACAT GCTCCTTTTG ACCAAGGAAG 6960

TTTCTATCAA GAAATACCGA GCAGAGCTGG CTATCATGAT GTCTTGCAAG CGATCTATCA 7020 

AGGCCAATCA TCGTATTGAT GATCATTCAG CTAGACAACT CCTCTATCAG CTTTCTCAAT 7080 

GTGACAATCC CTATAACTGT CCTCACGGAC GTCCTGTTTT GGTGCATTTT ACCAAGTCGG 7140

ATATGGAAAA GATGTTCCGA CGTATTCAGG AAAATCACAC CAGTCTCCGT GAGTTGGGGA 7200

AATATTAAAA GTATAAAAAA GTCTGGGAAA AATTTTCAAA ATCAAAAAAA CGCATAAAAT 7260

CAGGTGTTCA AAAACCTTGA TTTTATGCGT TTTATCATGG AAATAGTTAC TTCATTTTTT 7320

CCTAATTCTT TTCGAAACTC TTTTTAAACG ACGTCAGTTT TATCAGTAAT CTCAAAACAG 7380

TGTTTTGAGC TAATTTTGCC AGTTTTGTCT GTAACATCGA AGTTGTGTTT TACCACTCTG 7440

CGACTGGTTT CCTAGTTTGC TCTATGATTT TCACAGAGCA TTAAATTGCG ATTTTGCCAA 7500

GTTTCTTTAT TCGTCTAAAA GTAGAGTCTG TTCTATGCGT CTAATGTACG AATCAGGTTG 7560

ACCATTTCAA TAGCTCCTTG TGCACACTCA GAACCCTTAT TTCCTGCTTT AGTACCAGCT 7620

CGTTCTATGG CTTGTTCAAT TGTATCTGTC GTTAGCACAC CAAACATAAC AGGAATTTCC 7680

CTATTTAAAC TGATTTGGGC GATTCCCTTA GATACCTCGC TACATACATA ATCATAATGA 7740

CTTGTATTCC CTCTAATGAC AGCTCCCAAG CAGATAATTG CATCATATTT TTTACTTTTT 7800

GCCATTTTTG ATGCAATCAG TGGTATTTCA AAAGCTCCTG GAACCCAGGC TACCTCTATA 7860

TCTTTCTCGT TTACATTCTC TCTTTTGAGA TTATCTAGTG CTCCAGATAA TAATTTTGAA 7920

GTTATAAATT CATTAAATCT CGCTACAACA ATACCTATTT TAATATTGTT TGCTACTAAA 7980

TTACCTTCAT AAGTGTTCAT TTATTTTTCC TCCATATTTA AAATGTGACC 8040

TTCTTTGTTT CTAAATAAAA ACTATCGTAA GGATTGGCTT CTATTTCGAT TGATATTCTA 8100

CTGGAAATGG TAATTCCATA TTTTTCTAAC TGTTCAACCT TGTCAGGATT ATTTGTCAGT 8160

AAATGAAGTG ACTGAAGTCC CAGATCTTTA AGCATTTTTG CTCCAATATG ATATTCTCTT 8220

AAATCACCTT CAAAGCCTAA TGCAAGATTG GCATCAAGCG TATCCATGCC TTGATCTTGT 8280

AAATGATAGG CTTTTAATTT ATTGATAAGT CCAATTCCTC GTCCCTCCTG TCGCAAGTAA 8340

AGTAAGACAC CCGAACCATT CTCAACAATC ATTTTCATAG CCTTATCGAA TTGCTGTCCA 8400

CAATCGCAAC GTAAAGAGCC TAAAACATCT CCTGTTAAAC ATTCGGAGTG 8460

AATACATTGG CTTCATCCTC TATATTTCCC ATAATAAGAG CAAGATGATG TTCCCCATTT 8520

AGTTTATCTA TATAGCTAAT TGCTTTGAAA TTACCGTATC TAGTAGGCAT ATTGACAGTT 8580

GAAACTCGTT CTACCAGCTG ATCATATACT TTTCTATATT CTTGTAATTC TTTGATGGTA 8640

ATTAGTGGAA TGTTGTGTTT TTTCGAGAAC TGAATTAAAT CATCTGTTCT CATCATTTTG 8700

CCATCATGAT TCATTATTTC ACAACATAGG CCACACTCTT TTAGTCCAGC TAATTTTAAT 8760

AAATCAACAG TTGCTTCTGT GTGTCCATTT CTTTCTAGGA CACCACCTTT TTTTGCAATT 8820

AAAGGAAACA TGTGTCCTGG CCTGCGAAAA TCAGAGGGTG TTATATCTTC AGCTACACAC 8880

ATACGTGCGG TCAGTCCTCT TTCCTCGGCA GAAATACCTG TGGTCGTTTC TTTATAATCA 8940

ATTGAAACTG TAAAAGCAGT CTTATGATTA TCTGTATTCT TTTCAACCAT AGGTGAAAGC 9000

ATTAATTGAT TAGCTAAACT TTCGCTCATA GGCATACAAA TTAATCCTTT GGCATAAGTA 9060

GCCATAAAAT TAACATTTTC TGTTGTAGCT GCTTGTGCAG AACAAATTAA GTCTCCTTCA 9120

TTTTCTCTAT CCTTGTCGTC TATAACAAGA ACAAGTCGTC CCTTCTGCAA TGCTTCTAAT 9180

GCTTCTTGTA TTTTTCGATA TTCCATTGAC TGATTATCCT TTCTGCTAAA ATCCATTTTG 9240

ATATAATAGT TCCTTAGATA TTTCTGATTT TGGAGAGTTA TCCATCAGTT TTTGCACATA 9300

TTTACCTAAG ATATCATTTT CAAGATTTAC TGTACTCCCG ACTTGTTTAC TCTTAAGAAT 9360

GGTTTGTTCC AAGGTATGAG GGATAACAGA TACTGAAAAG TTTACTTTGG AGACTTTAGC 9420

GACAGTCAGA CTAATGCCGT CAATTGTAAT AGATCCTTTT TCAACTATTA AATCTAAAAT 9480

TTCTTTTTGT GTGTTGATTT GATACCATAC AGCATTATCA TCTTTTTTTA TTGACGAGAT 9540

TTTTCCTGTA CCATCAATGT GTCCTGTAAC GACGTGACCC CCAAGTCGAC CGTTGACAGA 9600

TAAGGCTCTT TCTAGATTCA CCTCACTTCC ATGTTTTAAT AGAGTAAGAG CTGTTCGACT 9660

CCATGTTTCA TTCATTACAT CAACTGTAAA GGATTGATGA TTGAAATGAG TAACTGTAAG 9720

ACAGATACCA TTTACTGCTA TACTATCGCC TAAATGGATA TCCGTTAATA TTTTTGAGGC 9780

TTTAATTGAT AGTTTACAAT TACGAGAGTC TTTCTGTATT CTTTCAACTT TTCCGATTTC 9840

TTCAATTATT CCTGTGAACA TGGATAAATC ACTTCACTTT CTATGAGATA GTCATTTCCT 9900

ATTTGAGAAA ATGCATAAGG TTTCAATCTA ATAGCGTCAT TTGGCAAAGA AATACCTTCA 9960

CCTCCGACAG GAAACTTGGC ACTACCTCCA AAAACTTTTG GTGCAATATA TATTTTCAGC 10020

TCATCAACAA TTTGTTGTTC CAAAGCACTC CAATTCATTA GACTGCCCCC TTCTAGAACT 10080

AGGCTATCAA TCTGCATGTT TCCTAGATGT TGCATTAAAC TCGATAAGTC TATATGATTG 10140

CCTTTTTTCT TTATGGAAAG TATTTCACAG CCATGATTTT GATATAGCTT CATTTTATTT 10200

TTGTCTTCAG AGGAAGTGGC AATGTAAGTT TTAATATCAT TTGCTGTTTT TACGATTTTA 10260

GAGGTAAGAG GAGTTCGTAA ATGTGTATCG CATATGATAC GGATAGGATT TTTCCCTTCC 10320

TCCAATCTAC ATGTCAGCAA AGGATCGTCT TGAATAACAG TATTGACTCC CACCATAATT 10380

GCACTAACAT GGTGTCGTAA CTGATGCACA TGCTTTCTTG CTTCPTCTTC AGTAATCCAT 10440

TTGGATTGAT TTGTTTTAGT GGCTATTTTT CCATCCATTG ACATTGCATA TTTCATAAAA 10500

ACATAGGGTA CATGCTGGGT AATATACTTT CTAAAACTTT TTATTAAGTT AAGACACTCA 10560

TTTTCTAAAA TTCCAACAGT AACTTGAAGA TTATTTTCCT CAAGTATCTT TACTCCTTTT 10620

CCAGATACAA TAQGATTACA GTCTAGGCTT CCAATGACTA CTCTTGTAAT ACCACTATCG 10680

ATTATAGCAT CTATACAGGG AGGTGTTTTC CCGAAGTGAC AACAGGGTTC AAGTGTTACA 10740

TAAAGCGTCG CTCCGACAGG GGATTCTCTA CAGTTTTTAA GAGCATTTCT CTCAGCATGT 10800

GGGCCACCAA AAAACTCATG ATAACCTTGT CCGATAATGT GATTATCTTT TACAATAACT 10860

GCGCCGACCA TAGGATTGGG ATTGACGTAA CCAGCCCCTT TTTGTGCCAG TTTTATTGCT 10920

AATTTCATAT ATTTTGAATC GCTCATCTCG CTACCTCCAA AAAAATATAC CTTGAATAGG 10980

GGACTACTCA AGGCATACAA AAGAAAACTT ATGCGATTAA CAAAAATGCT CTGAAATGAC 11040

AAGTAATCAT TTCAGAGCAC GCAAAAAGCA CAAATATACT TTTAPCTTCT TTCATCCAGA 11100

CTATACTGTC GGCTTTGGAA TTTCACCAAA TCATGCCTTT CGGCTCGTGG GCTATACCAC 11160

CGGTAGGGAA TTTCACCCTG CCCTGAAGAT AGTTATTCAA TTACAGATGA TTATAGTACT 11220

TAATTTTGAA TATGTCAACA GATAAATACC GATTGTTTTT GATATACTGT ATTTGTGATA 11280

ATCGATTCTC GCTCCTCGGA TAAAGAAAAT ATGATATACT AGATAAACGA AATAAGAGAG 11340

AAGGAATACT ATGTACGCAT ATTTAAAAGG AATCATTACC AAAATTACTG CCAAATACAT 11400

TGTTCTTGAA ACCAATGGTA TTGGTTATAT CCTGCATGTG GCCAATCCTT ATGCCTATTC 11460

AGGTCAGGTT AATCAGGAGG CTCAGATTTA TGTGCATCAG GTTGrGCGTG AGGACGCCCA 11520

TTTGCTTTAT GGATTTCGCT CAGAGGATGA GAAAAAGCTC TTTCTTAGTC TGATTTCGGT 11580

CTCTGGGATT GGTCCTGTAT CAGCTCTTGC TATTATCGCT GCTGATGACA ATGCTGGCTT 11640

GGTTCAAGCC ATTGAAACCA AGAACATCAC CTACTTGACC AAGTTCCCTA AAATTGGCAA 11700

GAAAACAGCC CAGCAGATGG TGCTGGACTT GGAAGGCAAG GTAGTAGTTG CAGGAGATGA 11760

CCTTCCTGCC AAGGTCGCAG TGCAAGCAAG TGCTGAAAAC CAAGAATTGG AAGAAGCTAT 11820

GGAAGCCATG TTGGCTCTGG GCTACAAGGC AACAGAGCTC AAGAAAATCA AGAAATTCTT 11880

TGAAGGAACG ACAGATACAG CTGAGAACTA TATCAAGTCG GCCCTTAAAA TGTTGGTCAA 11940


ATAGGAGCAG 


AGAATGACAA 


AACGTTGTTC 


GTGGGTCAAG 


ATGACCAACC 


CGCTCTACAT 


12000 


CGCCTATCAT 


GATGAGGAGT 


GGGGCCAGCC 


CCTCCATGAT 


GACCAAGTAT 


TCTTTGAGIT 


12060 


GTTGTGTATG 


GAAACCTATC 


AGGCAGGCCT 


GTCTTGGGAA 


ACGGTACTCA 


ACAAACGCCA 


12120 


AGCTTTCCGA 


GAAGTCTTTC 


ATAGCTATCA 


AATTCACTCA 


GTCGCAGAGA 


TGACTGACAC 


12180 


TGAATTGGAA 


GCCATGCTGG 


AGAATCCAGC 


TATCATTCGA 


AATAGAGCCA 


AGCTTTTTGC 


12240 


TACACGCGCT 


AACGCCCAAG 


CCTTTCTACA 


GTTACAGGCA 


GAGTACGGCT 


CTTTTGATGC 


12300 


CTATCTTTGG 


TCTTTTGTTG 


AGGGGAAAAC 


TGTCGTTAAC 


GATGTTCCTG 


ATTATCGCCA 


12360 


AGCGCCAGCT 


AAAACACCCT 


TATCTGAGAA 


ATTAGCCAAA 


GATCTCAAAA AACGAGGCTT 


12420 


CAAGTTCACA 


GGCCCAGTCG 


CCGTATTGTC 


TTTTCTACAG 


GCTGCAGGGC 


TAGTTGATGA 


12480 
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CCACGAGAAT GATTGTGAGT GGAAAGGTCT TAAATGATGT CTAACAAAAA TAAGGAAATT 12540 

CTGATTTTTG CGATTCTCTA TACAGTCCTC TTTATGTTTG ATGGCGTTAA ATTGCTGGCT 12600 

TCTTTAATGC CATCTGCCAT TGCAAATTAT CTTGTTTATG TAGTTTTAGC TCTATATGGC 12660 

TCCTTCTTGT TCAAGGATAG ATTGATCCAA CAATGGAAGG AGATTAGAAA GACTAAAAGA 12720 

AAATTCTTCT TTGGAGTCTT AACAGGATGG CTCTTTCTCA TTCTGATGAC TGTTGTCTTT 12780 

GAATTTGTAT CAGAGATGTT GAAGCAGTTT GTGGGACTAG ATGGACAAGG TCTAAATCAG 12840 

TCTAATATTC AAAGTACCTT TCAAGAACAA CCACTACTGA TAGCTGTTTT TGCTTGTGTC 12900 

ATTGGACCTC TGGTAGAAGA ATTATTTTTC CGTCAGGTCT TATTGCATTA CTTGCAGGAA 12960 

CGGTTGTCAG GTTTACTAAG CATTATTCTG GTAGGACTTG TTTTTGCTCT GACTCATATG 13020 

CACAGTTTGG CTCTATCAGA GTGGATTGGT GCAGTTGGTT ACTTAGGTGG AGGCCTTGCC 13080 

TTTTCTATTA TTTATGTGAA AGAAAAAGAG AATATCTACT ATCCCCTACT TGTTCACATG 13140 

TTAAGCAACA GCCTCTCCTT AATCATTTTA GCTATCAGTA TAGTAAAATG AAATGAGAAC 13200 

AGGACAAATC GATTTCTAAC AATGTTTTAG AAGTAGAGGT GTACTATTCT AGTTTCAATA 13260 

TACTGTAATA TGTGATGAAA ATGCCAGTAA TGATACCGAG AAAAAAGCTG AGAAACTTTT 13320 

CCCAGCTTTA TTTGTTATAG TCAAAGAGAA TGACTTGTTC CTGTGCATCT ACATGAGCAT 13380 

GGACCCCAAA GGGTACAATT GCTCTTGGAG TTGCGTGGCC GACATTCAGA TTATAGACAA 13440 

TCGGGATATT GCTGTCAATG ATATCCAATA GTGCCTCTTT ATAGTCGTCA TGGAAAGTTT 13500 

CATCCATAGG TTTTCCGACC AAGAGTCCAT TGATGACCGC GAATATGCCA GTGTCCTTTA 13560 

AAGTTAGCAA CATCTTTTTG AAGTCTTCTG GCTTAGGCTT TTCTTCGCTT GTTTCGAGCA 13620

AGAGGATTTT CCCTTCCCAG TCTGACAAGT CAGGGAAAAG TTTGTATTTT TGGCAGAGTT 13680

CCGTGCTATC TGCGTATCGA GAGTTGTCAA AGATATCGTA GAGGGATTCG AGGCAACCAC 13740 

CGAGGATTTT CCCCTCGAAC TGGGCACTTC CTTGCAACAA GTCAAAACCT GTATTTGTAT 13800 

GACTGACACG AGGTGTTCCC AGGGCCGTGG GACTAAAATC AGTTCGTTCC TCATACCAAA 13860 

CGTCACTAGG GCGGATTTCT GAAATTCTTC CCGTCTCAAT CAATTCTTTA AAGTAGTGAA 13920 

GGCTATAGGC TAGCATTTCT TTGTCTAATT CACAAATGTC TGCTAAAAAG GATTGACCAT 13980 

AAAAAGTCTT GATTCCTAAT TTATGCAACA TGAGGTGGTT CATGGTTGTA TCCGAGAAGC 14040

CAAGAAAAAT TTTTTGCTTG ATAACCTTTT GGAGTTGGTC ATTTTCAAAA AGATAAGGTA 14100 

GCAAGCGATA GGTATCGTCT CCACCGATGG CACATAGGAT CATGTCGATG CTATCATCAG 14160 

AAAAGGCATG AATCAAATCC TCTGCACGAG CTTCAGGATG GTCCTTGATA AAGTCTAATC 14220

CTTTTAACGA ATGGGGCAAA AAGATGGGAT TGGTCCCAGA TCCTTGAGAC GTT 14273
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9828 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GTGAAGTGCG GCAAAAGGTG CAAGTGATGA GCTCAGGTTC TTTAGCTCTT GACATTGCCC 60 

TTGGCTCAGG TGGTTATCCT AAGGGACGTA TCATCGAAAT CTATGGCCCA GAGTCATCTG 120 

GTAAGACAAC GGTTGCCCTT CATGCAGTTG CACAAGCGCA AAAAGAAGGT GGGATTGCTG 180 

CCTTTATCGA TGCGGAACAT GCCCTTGATC CAGCTTATGC TGCGGCCCTT GGTGTCAATA 240 

TTGACGAATT GCTCTTGTCT CAACCAGACT CAGGAGAGCA AGGTCTTGAG ATTGCGGGAA 300 

AATTGATTGA CTCAGGTGCA GTTGATCTTG TCGTAGTCGA CTCAGTTGCT GCCCTTGTTC 360 

CTCGTGCGGA AATTGATGGA GATATCGGAG ATAGCCATGT TGGTTTGCAG GCTCGTATGA 420 

TGAGCCAGGC CATGCGTAAA CTTGGCGCCT CTATCAATAA AACCAAAACA ATTGCCATTT 480 

TTATCAACCA ATTGCGTGAA AAAGTTGGAG TGATGTTTGG AAATCCAGAA ACAACACCGG 540 

GCGGACGTGC TTTGAAATTC TATGCTTCAG TCCGCTTGGA TGTTCGTGGT AATACACAAA 600 

TTAAGGGAAC TGGTGACCAA AAAGAAACCA ATGTCGGTAA AGAAACTAAG ATTAAGGTTG 660 

TAAAAAATAA GGTAGCTCCA CCGTTTAAGG AAGCCGTAGT TGAAATTATG TACGGAGAAG 720 

GAATTTCTAA GACTGGTGAG CTTTTGAAGA TTGCAAGCGA TTTGGATATT ATCAAAAAAG 780 

CAGGGGCTTG GTATTCTTAC AAAGATGAAA AAATTGGGCA AGGTTCTGAG AATGCTAAGA 840 

AATACTTGGC AGAGCACCCA GAAATCTTTG ATGAAATTGA TAAGCAAGTC CGTTCTAAAT 900 

TTGGCTTGAT TGATGGAGAA GAAGTTTCAG AACAAGATAC TGAAAACAAA AAAGATGAGC 960 

CAAAGAAAGA AGAAGCAGTG AATGAAGAAG TTCCGCTTGA CTTAGGCGAT GAACTTGAAA 1020 

TCGAAATTGA AGAATAAGCT GTTAAAGCAG TGGAGAAATC CGCTACTTTT TCGATTTTTG 1080 

ATTCAAGTTT TTAGATTATA TATAGTAGCT TGAAATAAGA TATGAACAAC TCTATTAGGA 1140 

AAGTCAAATT AATTTCTAGA AATGTTTTAG CAGCTACAGC GTACTATTCC AAACTCAACC 1200 

AACTATAATA GATCGAAACT AGAATAGTAC ATATCTACTT CTAAAACATT GTTAAAAATC 1260 

GATTTGACTT TCCTTATTTC ATTCCGCTAT ATATAGTTTG CTGTTTCTTG TCGCTCCTCT 1320 

GGAAAGCTGA TATAATAGCT TTATGAATAA AAAACGAACA GTGGACCTGA TACATGGTCC 1380

GATTCTTCCC TCGCTCTTAA GCTTCACCTT TCCAATTTTG CTATCAAATA TTTTTCAACA 1440

GCTCTATAAC ACTGCTGATG TCTTGATTGT TGGACGATTT CTTGGTCAAG AATCCTTGGC 1500 

TGCAGTAGGA GCGACGACAG CGATTTTTGA CCTGATTGTA GGTTTTACAC TTGGTGTTGG 1560 

CAATGGCATG GGGATTGTCA TTGCTCGTTA TTATGGGGCT CGGAATTTCA CTAAAATCAA 1620 

GGAAGCAGTA GCAGCCACCT GGATTTTAGG TGCTCTTTTG AGCATTCTAG TTATGTTGCT 1680 

GGGCTTTCTT GGCTTGTATC CTCTCTTGCA ATACTTAGAT ACTCCTGCAG AAATTCTTCC 1740 

TCAATCTTAT CAATATATTT CTATGATTGT GACCTGTGTA GGTGTCAGCT TTGCTTATAA 1800 

TCTTTTTGCA GGCTTGTTGC GGTCTATTGG TGACAGTCTA GCAGCCCTGG GATTTCTGAT 1860 

TTTCTCTGCC TTGGTTAATG TGGTTCTGGA TCTCTATTTT ATTACGCAAT TGCATCTGGG 1920 

AGTTCAATCC GCAGGACTTG CTACCATTAT TTCGCAAGGT TTATCAGCGG TTCTCTGCTT 1980 

TTATTATATT CGTAAAAGTG TGCCAGAACT CTTGCCACAG TTTAAACATT TCAAATGGGA 2040 

CAAAAGCTTG TACGCGGATC TCTTGGAGCA AGGTTTGGCT ATGGGCTTGA TGAGTTCAAT 2100 

TGTATCTATC GGCAGTGTGA TTTTACAGTT TTCTGTTAAT ACATTTGGTG CAGTGATTAT 2160 

TAGTGCCCAG ACGGCAGCTC GACGCATTAT GACCTTTGCC CTTCTTCCTA TGACCGCTAT 2220 

TTCTGCATCA ATGACGACCT TTGCTTCTCA GAATCTAGGA GCTAAGCGAC CTGACCGTAT 2280 

TGTTCAAGGT CTTCGAATCG GCAGTCGTTT AAGTATATCC TGGGCAGTTT TTGTTTGTAT 2340 

TTTCCTCTTT TTTGCCAGTC CAGCTTTGGT TTCCTTCTTG GCTAGTTCGA CAGATGGTTA 2400 

CTTGATAGAA AATGGAAGTC TCTATCTGCA AATCAGTTCA ACCTTTTATC CCATTTTGAG 2460

CCTCTTGTTG ATTTATCGCA ATTGCTTGCA GGGCTTGGGG CAAAAGATCC TTCCTCTAGT 2520 

TTCTAGCTTT ATTGAACTAA TCGGAAAAAT CGTTTTTGTG GTTTTGATTA TTCCTTGGGC 2580 

AGGATATAAG GGTGTTATCC TTTGTGAACC TCTTATCTGG GTTGCCATGA CAGTTCAACT 2640 

GTACTTCTCA TTATTCCGTC ATCCCTTGAT AAAAGAAGGC AAGGCAATCT TGGCAACCAA 2700 

AGTGCAATCC TAGTTGGATT TACTGAATAA AATCCATTTC CTCTAGTGAA AATCGAAAAA 2760 

ACTTGTGTTC TCTTCTTTAG TTTGGTGTTG AAAATAGTTT AACAGACTTT TGACTTCTTT 2820 

TATATGATAT AATAAAGTAT AGTATTTATG AAAAGGACAT ATAGAGACTG TAAAAATATA 2880 

CTTTTGAAAA TCTTTTTAGT CTGGGGTGTT ATTGTAGATA GAATGCAGAC CTTGTCAGTC 2940 

CTATTTACAG TGTCAAAATA GTGCGTTTTG AAGTTCTATC TACAAGCCTA ATCGTGACTA 3000 

AGATTGTCTT CTTTGTAAGG TAGAAATAAA GGAGTTTCTG GTTCTGGATT GTAAAAAATG 3060 

AGTTGTTTTA ATTGATAAGG AGTAGAATAT GGAAATTAAT GTGAGTAAAT TAAGAACAGA 3120

TTTGCCTCAA GTCGGCGTGC AACCATATAG GCAAGTACAC GCACACTCAA CTGGGAATCC 3180

GCATTCAACC GTACAGAATG AAGCGGATTA TCACTGGCGG AAAGACCCAG AATTAGGTTT 3240 

TTTCTCGCAC ATTGTTGGGA ACGGTTGCAT CATGCAGGTA GGACCFGrTG ATAATGGTGC 3300 

CTGGGACGTT GGGGGCGGTT GGAATGCTGA GACCTATGCA GCGGTTGAAC TGATTGAAAG 3360 

CCATTCAACC AAAGAAGAGT TCATGACGGA CTACCGCCTT TATATCGAAC TCTTACGCAA 3420 

TCTAGCAGAT GAAGCAGGTT TGCCGAAAAC GCTTGATACA GGGAGTTTAG CTGGAATTAA 3480 

AACGCACGAG TATTGCACGA ATAACCAACC AAACAACCAC TCAGACCACG TTGACCCTTA 3540 

TCCATATCTT GCTAAATGGG GCATTAGCCG TGAGCAGTTT AAGCATGATA TTGAGAACGG 3600 

CTTGACGATT GAAACAGGCT GGCAGAAGAA TGACACTGGC TACTGGTACG TACATTCAGA 3660 

CGGCTCTTAT CCAAAAGACA AGTTTGAGAA AATCAATGGC ACTTGGTACT ACTTTGACAG 3720 

TTCAGGCTAT ATGCTTGCAG ACCGCTGGAG GAAGCACACA GACGGCAACT GGTACTGGTT 3780 

CGACAACTCA GGCGAAATGG CTACAGGCTG GAAGAAAATC GCTGATAAGT GGTACTATTT 3840 

CAACGAAGAA GGTGCCATGA AGACAGGCTG GGTCAAGTAC AAGGACACTT GGTACTACTT 3900 

AGACGCTAAA GAAGGCGCCA TGGTATCAAA TGCCTTTATC CAGTCAGCGG ACGGAACAGG 3960 

CTGGTACTAC CTCAAACCAG ACGGAACACT GGCAGACAAG CCAGAATTCA CAGTAGAGCC 4020 

AGATGGCTTG ATTACAGTAA AATAATAATG GAATGTCTTT CAAATCAGAA CAGCGCATAT 4080 

TATTAGGTCT TGAAAAAGCT TAATAGTATG CGTTTTCTTG TGGAGATATT TCCTTCAATT 4140 

TTGCTACTAT ATTAAACAAA AATCAAAAAG CAAACTAGAA AGTTATGCTC AAATAAAATC 4200 

TAAATTTGAC AATGTAAACC GAGTCGGATA GCTTTAAGTA CTGTTTTGAG GTTGAAGATA 4260 

CGATTTTTGA TAGGAACTCA TCAATTTTAG ATTTTTAAGC AGCATCAATA AATTGCTTCC 4320 

TTGTTTTGTC ATAATTTTTT TATTTAAAAA ATTATGACma GAGTGTGCTA TTCTTTTTAT 4380 

GAGAGGTGTA TGAATATGAT AAATGTATGT GATAAATGTA TGTGATGTTG GAAAAAGAAT 4440 

AAAAGAACTT AGAATATCTT CAAATCTTAC TCAAGATAAG ATTGCTGAGT ATTTGTCTVT 4500 

GAATCAAAGC ATGATTGCCA AAATGGAAAA AGGTGAAAGG AATATCACGA ATGGATTTAA 4560 

GTAATAAAGC TTCAAATCTT AGAAAAAAGT TGGGAGCTGA TGGTGAATCG CCGATAGATA 4620 

TTTTTAAATT GGTACAAAAG ATAGAAAATT TGACGCTGGT ATTTTATGGA CTCGGAAAGA 4680 

ATATTAGCGG AGTCTGTTAT AAAGGAACTC AGTTCAGTCT CATTGCAGTC AATTCAGACA 4740 

TGCCATTAGG AAGGTAAAGA TTTTCTTTAG CACATGGACT GTATCATCTT TATTATGATG 4800 

AGGTGAAGAA GAGTTCAGTC AGTCTTATCT TGATTGGTGA AGGAGATGAA ACTGAAAGAA 4860 

AAGCGGATCA GTTTGCTTCT TATTTTTTAA TTTTCCCATC TTCACTGTAT AGGATGGTTG 4920

AGGAAATCAG AGAAAATGCC AATAGAACTC ATCTTGAAGT AGAAGATATT ATAAAATTGG 4980

GTCAGTTTTA TGGTATCAGT CATAAAGCTA TGTTATATAG ATTGAGGAAT GATGGATACC 5040 

TTGATGCAGA AGAAATTAAA AATATGGATA TTAGTGTTAT AGAGACAGCT TCAAGATTAG 5100 

GCTATGATAC AAGTTTATAT CGTCCTTTGT CAGAAAGTAA AAAAGAAATG GCATTAGGAT 5160 

AATATATTAA TTCAACTGAA CAACTTTTAG AAAATAACAG AATTTCGCAA GGGAAGTATG 5220 

AGGAACTGTT ACTAGATGCT TTCAGATATG ATATTGTATA TGGGCTAGAT GAAGAGGGGG 5280 

GAGTTGTCGT TTGACTAGTC GTGTATTTAT TGATGCAGAT TGTATTTCAG TATTTTTATG 5340 

GGTTGGCACT GAACATCTTT TAGAAAAGCT CTATTTGGGT AAAATTGTTA TTCCACAAGA 5400 

GGTGTATGAT GAAATCAATA TACCTACAAT TCCCCATTTA AAATCTAGGA TAGATCAGTT 5460 

GGTAGCTAAG GGTTCAGCTG AGATTGTGAG CATAGACATT GGAACTGAAG AATACGCATT 5520 

ATATAGAGAT TTAACAAGAA ATCATGATAG TAACAAGATT ATTGGTAAGG GAGAAGGGGC 5580 

ATCTATTTCC TTAGCGAAAA AGCATAATGG GATATTAGGA AGTAATAACC TAAGAGATGT 5640 

TAAATCATAT GTAGAAGAAT TTTCTTTAGA ATATATGACA ACAGGAGATA TACTGATTGA 5700 

AGCGTTTAAA GCGTAATTTA TTACTGAATA AGAGGGCAAT CATATCTGGA ATAATATGCT 5760 

TAAAAAGAGA AGGAAAATTG GTGCAAATTC ATTTTCAGAC TATCTTCGTG GAAGTATTCA 5820 

TCAAAATAGA CAAAAATAAA TTTGGATAAA TCGAACTCAC TATTCAGGAG GCATATGAGC 5880 

AATTCGAAAA AGAAAAGTGT CAAATTGAGC CTATAGGAGT AGAAGTGAAA TAGTAAGTCC 5940 

TGCATAGTGG ATGAGAGAAA AGTTCTCCTT GAAGTTTTCC TGAACTATCA GTCGCATGTC 6000 

AAACGATATG TAGGGTAATG TGAGAGGGGA TAGCGAGTAG TTTTTGGTTA TTTTATCAAA 6060 

AAACTTATAT TTTATTATAC CGAATGATAA AATATAATAA AAATGATAGA ATAAGGAAAA 6120 

AACATGAATG TCAAAAAGAT AATGTCAATT TTTCAATCCT TTTATGTTGA TGTCAGTATT 6180 

GAGGAACTGA CTTTGACTTT ACCAATCAGT TTTGTAAAAA GGTTTGAGTA TACTCAAATG 6240 

ACTTTTCATA AGGAATCATT TTTATTGATT AAAGAAAAGA GAAGGGGGAG TTTGAGTTCA 6300 

TTTGTTACTC AGGCTCGCAC TATGGGTGAA AAAGCCAATA TGGATGTTGT TTTGGTGTTT 6360 

TCGAAGTTAT CAGACAGTGA AAAAAAGCAA TTACTTCAAG CTAGAGTTCC GTTTGTAGAC 6420 

TTTAAGGGAA ACCTCTTCTT CCCTCCATTG GGACTAGTAC TCAATGCGAA TGATACTGAA 6480

GTCCCTAAGG AATTAACACC TAGCGAACAA TTAACGTGGA TTGCCTTTTT ATTGACAAAA 6540 

GGTCAAAAAG TAGTAGATGT TGATTTGCTT TCACAAGTCA CTGGACTTCC AAACTCAACA 6600 

ATTTATAGGT GTTTGAGGAC TTTTAAAGCT TTATATTGGT TAAACAAGCA AAATAAGCTT 6660

TACACATATA


CGGTGTCAAA 


GAAAGAATTA 


TTCTTAAAAT 


CCGTGTCATG 


TTTATTTAAT 



6720 


CCCATCAAAA 


AACGGATTTT 


ATTGCCAGAT 


GGCGATATAA 


AGCAGATAAA 




6780


AACCTTCTAT 


ATGGTGGTGC 


TTATGCTTTG 


TCGCATTCAA 


CTTTTTTAGC 




6840


GAAAATATTA GCTATGTCAT 


ATGGCAGAGA 


AAATTCAATC 


AGTTATCCTT 


RPPAPfTTPT 


6900


CAGCATGTTT 


TAAAATGAAA 


GATGCTAGAG 


ATATGGAAAT 


ATCGTCCTTT 


1 1 I Aw l\a Ala 


6960


TTTTGGAATG 


ATTTTAAAAA 


TAATCATGAT 


AAACAATTTG 


TAGATPPGAT 


1 A v. 1LU1A1 


7020


TTGACCTTAA 


AAGATGATGA 


TGACCCACGT 


ATAGAGGAAG 




1 AljrAAAA I 


7080 


ATGATATTAC 


AGTATCTGGG 


AGAAGATGAT 


GCCAGCTAAT 


AP CI A A AfWP A 
n^unnnu 1 1 A 


TTTTTCAAGA 


7140 


AATGTTTGCG 


GATTTTCAGA 


ACTATTATGT 


TCTGATTGGG 


fin, a a r*TY2 r**p n 


t. (-1 CT ATCGT 


7200 


ATTGGATTCG 


CAAGGATTTA 


AAAGTCGCAC 


AACAAAAGAT 


InluAlnluu 


i uATvJATTGA 


7260 


TGAAGTAAAA 


AATAAGGAAT 


TTTATACTAC 


CTTGAATCAT 


1 A 1 1 XAvAAl 


1 wjliAviAU 1 A 


7320 


TCAAGGAAGT 


CAGAAAGATG 


AGAAAGCGCA GCTTTTTCGA 


TTT AP A A P A A 


U l AA rt-L. a \jA 


7380 


GTTTCCTTCT 


ATGATTGAAC 


TATTTAGTAT 


CTTACCAGAA 


lni — rt 1 1AA 


AG AAVjviAL Wj 


7440 


TCGAGAAATT 


CCCTTACATT 


TTGACCAAGA 


TGCTAGTTTA 


TP ARPPTT A T 




7500 


AGATTATTAT 


AATATATTGG 


TGCATGAAAA 


AGAAACCATT 






7560


TAATTGTGGT 


TTATACTCTT 


CGAAAATCTC 


TTCAAACCAC 


GTCACJPTTPP 


ATPT AfA app 


7620 


TCAAAACAGT 


GTTTTGAGCA 


GCCTGCAGCT 


AGCTTCCTAG 


x ± ± \jv# 1V.1 1 i 


o/\ 1111 LA 1 1 


7680 


GAGTATTAAT 


TATTTTTAAG 


GCTAAAGCTT 


GGCTGGATAT 


GAGGGAGCGC 


A V- iuv^ALAb 


7740


GTGCTCAAGG 


TTTAAGTAAG 


TCCATTAAAA 


AGCATTTGAA 


TGACCTTACC 


AAA \jnKrf\\3 


7800


CTTCCTTGCT 


AGGAGATGAA 


AAGTTATCGG 


CTATAACATC 


AAGTAGTGCG 


GTAAAAPPAP 


7860


ACATGCACCG 


CTTTGTGATA 


GAATTAGAGC 


CTGTGAAGTC 


AACTATTCTT 


CAAAATAATft 


7920


ACATTTCATT GGATCAAAAT GAAATTTTTG AAATTCTGAA AAATTTTCTC GATGGTTAAA 7980

ATAATTGTAG CGAGATGGCT ATATTGAATT CGTCTATATC TGGAAACTAG AAAAAACTTC 8040

AATTTCAGGA GAAAATGAAG TCAATCTTCC CACAATCAAA CGTATAGTAT CAAGGTTTTT 8100

CAAGACCTGA TATTATGCGT TTTTTGCTTT TCAAAACTTT TTGCCCAGTC TTCGTTTTTA 8160

TCCTCTAGTC ACTTGATTTG TTTCAGGTGG TTTTTTAGTA TAGTAGAATG AAACGAGAAC 8220

AGGACAAATT GATCAGGACA GTCAAATCGA TTTCTAACAA TGTTTTAGAA GCAGAAGTGT 8280

ACTATTCTAG TTTCAATCTA CTATAGTTAA ATCTGCGGTC AAGTCTACTG GTGAATCTAT 8340

GATTGTAATA CTCTTCCAAA ATCTCATCAA CCACGTCAGT CTTGCCTTGC AGTCTGTATC 8400

TTACTGACCA AGCTAGTGAT GGATTTAGAA TAGGTGATTT GGAGCGTCCT ATTAGCTAGG 8460

AAATGCTGCT CATAGTCCTT TGCTGAGGCT AGGGTGTTTC AACATTCAAC ACTCAACTGG 8520

TTGATCTAGT TGATAGGAAG GGAGTTACTA TAAAATACTC AGGCTTCCAT CATATTTTTT 8580

GAAACGATTG TGTAATCAAA ATGTACCAAT ATTGTAGTAT TGGTACAGAA GATGTTGTGA 8640

ATGGATAAAT ATATCATAAC TGCTATCTCA AAAAGATTTC ATATGTCTGT GCATATATAA 8700

TAGACTTCCT GCAAAACTAG AATCCTAGTT CATGATTGAT AATACCAGCA ATCAAATTCA 8760

TTCGTAATCC AAAGCGTTTA CGATGATTTC GATAGGTTGT TGAAAACATT TTAAACGTTT 8820

CTACTTTGGC AAAGATGTTC TCAACCTTGC TTCTCTCCTT AGATAGCGCA TGGTTATAGG 8880

CTTTATCTTC AGCTGTTAGC GGCTTGAGTT TGCTGGATTT ACGTGGAGTT TGTGCTTGAG 8940

GACATATCTT CATGAGCCCT TGATAACCAC TGTCAGCCAA GATTTTACCA GCTTGTCCGA 9000

TATTTCTGCA ACTCATTTTG AACAACTTCA TATCATGACA ATAGTTCACA GTGATATCCA 9060

AAGAAACAAT TCTCCCTTGA CTTGTGACAA TCGCTTGAGC CTTCATAGCG TGAAATTTCT 9120

TTTTACCAGA ATCATTCGCT AATTCTTTTT TTAGGGCGAT TGATTTTTAC TTCCGTCGCA 9180

TCAATCATTA CCGTGTCCTC AGAACTAAGA GGAGTTCTTG AAATCGTAAC ACCACTTTGA 9240

ACAAGAGTTA CTTCAACCCA TTGGCTCCGA CGGATTAAGT TGCTTTCGTG AATACCAAAA 9300

TCAGCCGCAA TTTCTTCATA AGTGCGGTAT TCTAGGCTTA ATTTAGGTTT TCGTCCACCT 9360

TTTGCGTGTT TAAGTTGATA AGCTGTTTTT AATACAGCTA ACATCTCTTT AAAAGTCGTG 9420

CGCTGAACAC CAACAAGACG CTTAAATCGT GTATCAGTTA ATTGTTTACT TGCTTCATAA 9480

TTTCGCAGGG AGTCTATTGA CTCTTTGGTA GGTGTCAATG TTTTTTTCAT CTATCCCGAG 9540

AATTATTTTC CCGCCATTTG TATTTGCAAA TGCTGAGTAG GTTTCCCAGA AAGACTCTGG 9600

AAGATTGTTT TTAGCTTTTT TGTATTCTAA ATCAACCCCT TCAAATTTTA AGTCCATATT 9660

TTTCCTTTAC ATCTGTTTTT TGTGGTTCTG GTATTTGTTC AAGTTGAGTG ATAATATAGC 9720

GAATTGAATT TCGAGAGTTT TTACTCAGTT AATTTCTTTT TTAACCCACT TTAATTGCTT 9780

TTTTAACACG GGTTAAAAAA GAAATTAAAG TGGGTTAATT TTTCTTGA 9828


(2) INFORMATION FOR SEQ ID NO: 42: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3369 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CCGCGAAAGA TATTTTTGAA CAAGAGTTTG GACGTGAGGT CCGTGGCTAT AATAAAGTAG 60

AAGTTGACGA GTTTTTAGAC GATGTCATCA AGGACTATGA AACCTATGCT GCCTTGGTCA 120 

AGTCACTTCG TCAGGAAATT GCGGATTTGA AGGAAGAATT AACTCGTAAA CCGAAACCTT 180 

CACCAGTTCA AGCAGAACCC CTTGAAGCGG CAATTACAAG TTCTATGACG AATTTTGATA 240 

TTTTGAAACG CCTGAATAGA TTGGAAAAAG AAGTTTTTGG TAAACAAATT TTAGATAACT 300 

CAGATTTTTA AGTAGTTATT TGAGATGTGC AATTTTTGGA TAATCGCGTG AGGAGAATTG 360 

TTTCTCATGA GGAAAGTCCA TGCTAGCACA GGCTGTGATG CCTGTAGTGT TTGTGCTAGG 420 

CGAAACCATA AGCCTAGGGA CGAGAAATCG TTACGGCAGT TGAAATGGCT AAGTCCTTGG 480

ATAGGCCAGA GTAGGCTTGA AAGTGCCACA GTGACGGAGT CTTTCTGGAA ACAGAGAGAG 540 

TGGAACGCGG TAAACCCCTC AAGCTAGCAA CCCAAATTTT GGTCGGGGCA TGGAGTACGC 600 

GGAAACGAAC GTAGTATTCT GACTGCTATC AGCTAGAGCT GTTAGTGGTA GACAGATGAT 660 

TATCGAAGGA AGTGGTCCTA GTCACTTCTG GAACAAAACA TGGCTTATAG AAAATTGCAT 720

ATAGGTTGGG GCTGAGAAAT TTTCTCAACC TCATTTTTTA AAGTGGACAT ATAGAAAGGT 780 

CTTGCAAGAC TGTAACATGA AAAAAGAATT TAATTTAATT GCAACTGTGG CAGCAGGGCT 840 

TGAGGCTGTC GTTGGTCGTG AAGTGCGAGA GTTGGGCTAC GATTGTCAGG TTGAAAATGG 900 

ACGTGTTCGT TTTCAAGGAG ACGTGAGAGC TATTATCGAA ACCAACCTTT GGCTTCGGGC 960 

AGCAGATCGT ATCAAAATTA TCGTAGGAAC GTTCCCAGCT AAGACTTTTG AAGAGCTATT 1020 

TCAGGGAGTT TTCGCTTTGG ATTGGGAAAA TTATTTACCA CTTGGAGCTC GGTTCCCGAT 1080 

TTCAAAAGCT AAATGTGTTA AGTCCAAACT TCACAATGAG CCCAGTGTTC AGGCTATTTC 1140 

TAAGAAAGCT GTTGTCAAGA AATTGCAGAA ACACTATGCT CGCCCAGAAG GGGTTCCTCT 1200 

GATGGAGAAT GGCCCAGAGT TTAAGATTGA GGTCTCTATT CTCAAAGATG TGGCAACTGT 1260 

CATGATTGAT ACGACCGGGT CTAGCCTCTT TAAACGTGGT TATCGTACCG AAAAAGGTGG 1320 

CGCTCCTATC AAGGAAAATA TGGCAGCAGC CATTTTACAA CTTTCTAACT GGTATCCAGA 1380 

CAAGCCTTTG ATTGATCCGA CCTGTGGTTC GGGGACTTTC TGTATTGAGG CAGTTATGAT 1440 

TGCTAGAAAG ATGGCGCCAG GTCTTCGTCG CTCTTTTGCA TTTGAGGAAT GGAACTGGAT 1500 

CAGCGATCGC TTGATTCAAG AAGTGCGCAC AGAAGCGGCT AAAAAAGTAG ACCGTGAGCT 1560 

TGAGCTGGAT ATCATGGGCT GTGATATTGA TGCTCGCATG GTGGAAATTG CTAAGGCCAA 1620 

TGCTCAGGTA GCTGGTGTTG CAGGAGACAT TACTTTTAAG CAGATGCGCG TGCAGGATTT 1680 

ACGTTCCGAT AAAATCAATG GAGTAATCAT TTCCAATCCG CCTTATGGTG AACGTTTGTC 1740 

AGATGATGCA GGGGTGACCA AGCTCTATGC TGAGATGGGG CAAGTATTTG CACCGCTGAA 1800

AACTTGGAGC AAATTTATCC TGACTAGTGA TGAAGCTTTT GAAAGCAAGT ATGGTAGCCA 1860

AGCAGATAAG AAGCGTAAGT TATACAACGG AACCTTGAAA GTGGATCTAT ATCAATATTT 1920

TGGTCAGCGT GTCAAACGGC AAGAGGTAAA ATAGAAAGGG ATACTCATGA GTAAAAAAAG 1980

ACGAAATCGT CATAAAAAAG AAGGTCAAGA ACCGCAATTT GATTTTGATG AAGCAAAAGA 2040

GCTAACAGTT GGTCAAGCTA TTCGTAAAAA TGAAGAAGTG GAATCAGGAG TCTTGCCTGA 2100

GGATTCCATT TTGGACAAGT ATGTTAAGCA ACACAGAGAT GAAATTGAGG CGGATAAGTT 2160

TGCGACTCGT CAATACAAAA AAGAGGAGTT CGTTGAAACT CAGAGTCTGG ATGATTTAAT 2220

TCAAGAGATG CGTGAGGCTG TAGAGAAGTC AGAAGCTTCT TCGGAGGAAG TTCCATCTTC 2280

TGAAGACATC TTACTACCCT TGCCTCTGGA CGATGAGGAG CAAGGCTTGG ATCCTCTATT 2340

GCTAGATGAT GAAAATCCAA CAGAAATGAC TGAAGAAGTG GAAGAGGAGC AAAACCTTTC 2400

TCGTCTGGAT CAAGAGGACT CAGAAAAGAA AAGTAAAAAA GGCTTTATTT TGACCGTTTT 2460

GGCGCTTGTA TCAGTAATTA TTTGTGTCAG TGCTTATTAT GTCTACCGTC AAGTGGCTCG 2520

TTCGACTAAG GAAATTGAAA CTTCTCAATC AACTACAGCC AATCAATCGG ATGTGGATGA 2580

TTTTAATACA CTTTATGACG CCTTTTACAC AGATAGCAAT AAAACGGCTT TGAAAAATAG 2640

CCAGTTTGAT AAACTGAGTC AACTCAAGAC TTTACTTGAT AAGCTGGAAG GTAGTCGTGA 2700

ACATACGCTT GCCAAATCTA AATATGATAG TCTAGCAACG CAAATCAAGG CTATTCAAGA 2760

TGTCAATGCT CAATTTGAGA AACCAGCTAT TGTGGATGGT GTGTTGGATA CCAATGCCAA 2820

AGCCAAATCG GATGCTAAAT TTACGGATAT TAAAACTGGA AATACGGAGC TTGATAAAGT 2880

GCTAGATAAG GCTATCAGTC TTGGTAAGAG CCAGCAAACA AGTACTTCTA GCTCAAGTTC 2940

AAGTCAAACT AGCAGCTCAA GTTCAAGTCA AGCAAGTTCA AATACGACTA GTGAGCCAAA 3000

ACCAAGTAGT TCAAATGAGA CTAGAAGTAG TCGCAGTGAA GTCAATATGG GTCTCTCGAG 3060

TGCAGGGGTT GCTGTTCAAA GAAGTGCCAG TCGTGTTGCC TATAATCAGT CTGCTATTGA 3120

TGATAGTAAT AACTCTGCCT GGGATTTTGC GGATGGTGTC TTGGAACAAA TTCTAGCGAC 3180

TTCACGTTCA CGTGGCTATA TCACTGGAGA CCAATATATC CTTGAACGTG TCAATATCGT 3240

TAACGGCAAT GGTTATTACA ACCTCTACAA GCCAGATGGA ACCTATCTCT TTACCCTTAA 3300

CTGTAAGACA GGCTACTTTG TCGGAAATGG CGCTGGTCAT GCGGATGACT TAGATTACTA 3360

AGCAGTCGG 3369


(2) INFORMATION FOR SEQ ID NO: 43:











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9713 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



AAVj i i i acaa TTTAAATGAA TTAACAATTT TCCCAACTAA AAGCACTCCA GTTACCGCAA 60

wCaTTTGTACT GAATGTACTA AATCGCATTC CATCAACTTC ATCTGTTTCG TCAACTTGAA 120

CAGATACTAA TTGAAGATTT AATACTTCTG CTGCCATAGC TAGCTCCTCC TATTTAAATT 180

TTTGGGATTA AGTACTTTAT CCACCCTCAT ATACTCTCTC CACCAGTAAA ATGCAAGCAA 240

TGATACAAAA TAGATTTAAC TATTTTATAT AGCGAAAACT TACAAATTTT TAAGAAATAA 300

TTTTTGCATT CTTAAAGATA AAATAGGAAC TTTTAGTAAT AAATATTAAA ATAAATAAAA 360

TAATAGATAC TATAAAATTT GGAAGTATTA ACCCCAAAAG ATTCATATCA TCTATTAAAA 420

lArCCTCTAA AGAGTAGTAT ATTAAAGCCA TAATTTTAAT GTTAAGTAAA AATGCAATTA 480

A rGAAGTAAC AAATGTCAAA AATATAGCCT CACCAACTTT AATCTTAACC ATCTGGTAAT 540

TAGAAGTTCC TAAAATTTCA AATTGCTGAA TCTCAATCCT TTCTTGATGC GATGACAAAA 600

ATGCAATTGA AATAATATTT GCAAGTACTA TCAAAATTGG TGCTCCTACA TAGACAATAA 660

ATGCTACTTT TAGCTCTAAA TCACTGTCAT CTTGAAATTG AGATAGTATA TTCTGAGAAA 720

TCATTTGAAA ACTAGAAATT AGTAATATAG CTCCTGTAAT TGCAGCACTG ATAGATTTTA 780

TATAAGACTT ACAATATAGT AAATTCCACT TCGAAACAAT GAACATAAAA TTATTTCTAA 840

ATATAATTAT AGAAAGTAGT TTGATAAAAC ATGACTGTAT AAAAGGAGAT AATTGATAAA 900

TAATCACAAT ATCTAAGATT ACAATATTGA ATATTATCTG GGCCTTCGCT AAAATTGTGC 960

TATCTTGGAA AATTTGTTGC AAAGAAAGCA ACCAGATAAC ACTAAAACCA GCCAATAGCA 1020

GTATTCTTTT TACTATTGAA AGAACATGCC TTATTTTAGA ACTCTTCCTA TTTCTAATCT 1080

TCTTGAACGT ATAAAAGCAA CCACTTAGAA AGGCTAAAAA TGAAATCAAC ACTACTGTAA 1140

TGATACATCC AACAGCACTC GTTTGAAATT GGATATCAGG TAATATATTT TCCCCGAAAA 1200

AGTATTGTAA AAAATAATAA TAATTTGACG TAACAAATAT AGAGCATAGA TATGCAATAA 1260

AACTAATAAT CGAGGAAATG ATAAAAATCT GTCCCCCCAC AAGAAATGAT AGTTGAAGGC 1320

GACTTGCTCC CAACACCTCC AGAAGTTCGT AATCATCTCT AAAAATTTCA ACCAACATAT 1380

TTATTATGTT AGAGAGCACA AAGAATAATG TTACTCCTCC GAATACTATC GGAAACATAA 1440

AAATTGGTTT AGGATCTGGA AGTCCGACAA ATACTTGCGA ATTATTCTCA ACATTAATTA 1500

CCCCATTAAC AGCCAATCCC ATAACTAAAC TCGAAACAAA AATTACTGGT GAAACGCCTA 1560

ACCATTGTTT CTTATTATGT AAAAATTGAT AGTAAACTAA TCTGAGCATC TCTATTCCTC 1620

CGTAGTTGAT TGTACCTCTA AGATTTTATA CAACTCTTCC CCGCTAGGTC TATGAAGTTC 1680

n ™ MATT TTTCCATCTT TCAATATTAA TGCACGATCA GTTTTCGAGG CCAATTCTAT 1740

ATCGTGCGTT ACCATAATTA CACACTTACC CGCCCCTACT AAC^CTCA ATAATTCAAA 1800

AATTACTTCA CGAGAAACGC TGTCTAAAGC CCCAGTTGGC TCATCAGCAA ATATTATATC 1860

ACTATCAGCA ATAACCGCTC TAGCTATAGC AACC^T ^ CCAC CAGACAGAGT 1920

TCCAACAAAA TCGTTTAAGC CAGCATTAAA CTTCATTCTT TTGAGTAAGT TTTCTACATT 1980

TTTAATAGTT AATTTTTTTT GTGATAATCG CAAAGGAAGT GCTATATTTT CTATTACCGG 2040

CAGGGAAGGT ATTAAATTGT ATGCTTGAAA TATAAAAGAT ACTTCGTTAC GTCTTATACT 2100

TGACAATTTT GCATTTCTGA TTTTATAGGG GTTGATTCCA TTTAAAATTA CTTCCCCACT 2160

TGTTGGTTCA AGCAAACTAG AAATACATTT TAATAAAGTT GACTTTCCAG AACCACTAAT 2220

TCCTAGAATA CTTATAAATT CTCCTCTCGA AGCAGAAAGA GAAACATTTT TCAGCACTTG 2280

CAACGTTTTA WMTO CT A GTAAAAATTG ATGATACAGC CCTTTCACTT TTAATATATA 2340

ATCTTTATCC ATATTCTTGC CTCCAATCAC TTAATTTTGA AAAGTGTTCC ATTTUCCAAT 2400

TTATATATAT CAGTGTATCT CTTGTCATTT AAGTCATAAT GATGTGAAAC TTCAATAAAT 2460

GAAATACCTA AATTGAACAG AATATCATGT ATGGAATTTG AATTATCATT ATCTAAATTA 2520

GCTGATATTT CGTCAAATAA GTACACTTTA TTATTTCTAA TCAGAGCTCT AGCTAAACCT 2580

ATTTTTTGTT T^GACCTCC AGACAAATTA CTACCAr™ CACCACATTG ATAATTTAGT 2640

ATATCTATCT TTTCTAATTC TTCATATAGA TTTACCTTTT TTAACACCTC AATTATCTGA 2700

TCATCTGAAA AATATTCATT TTGAAATAAA GTTACGTTCT CACGAATAGT AGTGTCAAAA 2760

ATATATGGTG TCTGATCAAC TGTTGGTATT GAATCTGAAC TCTTTTCCC ATGTGATAAC 2820

AAATTTACAT AACC^TTTG TGGCTTTAAA GAACCATTAA TTAAATTTAA AATCGTTGTT 2880

TTCCCACTAC CAGAAGTTCC TGTTAATAAT ACCCTAAATG GTGACTTAAA TGAGAAGTCA 2940

ATACTTAATT TATTTTCTGG TGTAATAGAA TATACAACAT CTTTCATGTG TATCTCATCT 3000

ATTGATGAAG TATACAGTCC GTTATTATCA TGTTCAGCGT CTATAAAATT CTTCTCTCCA 3060

CTTAAGTATT TTAAAAACGG TTTCCTTAAA TCTTTGGTTG TATTTATCTT ATTTAATGAA 3120

TAGGCAATTG ATTGTATCGG CCCTAAAACT TTATCGTTTG CTAAGAAAAT ACCTATCAGT 3180

TCACTAAAAG AAAGGCTTTT ATGATAAATT ACAAAATAAC ATCCTACAAC CAAGGGAACT 3240

AGAAAGCAAA AACCTGAAAT TAGTACTGCA ACCAATTTTG AAAGAACCTC TGATCGTTTC 3300

AAATTAAAAG TAGAATCTTC TAGTTTATCC AACTTTTTAT CCGACAAACT AATTATTTCT 3360

TTAGTAACAG AATAAGATTT TAATGTCTTA AAACCATTAA AAATTTCTTT TATTATGTGA 3420 

GTATACTCTG CATTGCTGTT AGAGTACTCA TTAGCTGAAT TAGACAACAT CTTCTTCATA 3480 

AAGACAGGTA CTATAATCGG CAATGCTGAT AATACAATAA ATATTATTGA nACTAGGAAG 3540 

TTTAAATAAA GCATAAAACT TAGAGAGACG ATGAACAACA ATATTGAAGA AATTATTTCA 3600 

AAAATTTGTC TAAAATAGTT TTCTTCGATT AATCTCAAAT CATTTGACAA AACTGAAATA 3660

ATAGATGAGT AATCTTTAAC CATTTCAGAA GAAAGATACT GTTCTCTAAA ATATCCTTGT 3720 

TTAATTTTTA CATTTATATC TTTAGTTATT GATGCTTCCG TTACTTCTAA ATAGTAATTT 3780 

GATATATAGA TTGCTGACCA ACCCAGAATA CTTATAGCAC CAAATCTTAG AACGTCAGAA 3840 

AATGAGGAAG TCTGATTTAA ACTACCTGCA TATACAATAA TTCCTGAGAG CAAGACACCA 3900 

TTAAACGAAG ATAGAAATAT TAAAATCCCC ATTAATATAA GTTTAGTCTT TTTTATAAAT 3960

TTTAAATAAT TCATAAGTTA TTCCTTCCCA CTTCTTCAAA GAAATAATTT AAAGTATCAA 4020 

TCATTAAGAG AACATCTGAT GGAGTAAAAC CTCCATGACC AGCTGCTTTG TTTAAATACA 4080 

ACAAACTTTT AACTCCAATA GAATTTAATT TCTTTGACCA CTCTATCACT TCGTTATTAT 4140 

TAATATATGG GTCTTTCTCA CCCAAAATAT TAACTATAAC AGTATTTGAG TCTCGTGCCT 4200 

TTTCAATATT TTGCATAGGC GAATATGACT TTATATAAGC CTTTACTTCA GGGTCTCTAA 4260 

TATCTCCCCA CTCTGCTATT TCGGTCTTAG AAAGAGGATC ATTTGGATTC TGAAGTGTAT 4320 

CATAAGGATT TATAAATGGC GAAAATAAGA GAATGCTTTG CAATAAATTT TTTTCCTCGT 4380 

TCAACACCGC ACCAGCAATT ATTCCACCTG CACTAGAAGT TATTAAACCf AATCGCTTAC 4440 

TGTCAATTAC ATCATTTTCC CTTAAATAAT TTACTCCCTC AATAAAATCT CTGATAGAAT 4500 

TCCATTTGTT TAACGCCTTT CCTGAGCGAT ACCATTCACC ACCCAAATAG CCTCCACCTC 4560 

TTACATGAAC TATAGCATAA ATAAAACCTG CATCTATTAT AGATAACATA ATTTCATCTA 4620 

AATCAGAATT ATCATTCTTA CCATAAGCCC CATAGACACT TAGAATACAT TTTTTTCTTC 4680

TTGGGAGCTC ATCCGTATCT TCACTTTTCC AAAATAAAGA AATCGGTATG CTTACATCAT 4740 

AACTGTCTTT TTTAGTCCAA ATCACCTTAG AAAAATATTT AGTATTATTC GATTTTATGA 4800 

TGGGTCTTTC AAATTCAGTT TTTAATGTAT TTTCTATTAA ATCAAAACTA AGTATTTTTT 4860 

CGTAAAAAGT TCTCCTCTCT AAAAACAGAA GAACACGATC AGAAAATGAA TTTTCATAAA 4920 

GTGTTGTCTT TTCATCAAAT GTTATCTTAT TAACACTCAA CTCCCTCAAA CTATTATTTT 4980 

TAAATGTAGC AAGATAAAAG ACGGAATTCG CTGCGTTTGA ACAGTCTAAA AGGATATAAC 5040 

GTCCTATACA GTGAACTCTT CTAGCCCTAT CTTGATATGG TATAGTAATA GAAACTCTGT 5100

CTCCCGAAGA AGTTTCCCTT AGAATTAGTT GATCTTTCTT TTCTTCAGTT GAAGAGAGCC 5160

CAAGAAAGTA CTGTGCTTTT TCTGTACTAA ATAGAGCGAT ATCTCTAGGT GTTGGGGCTA 5220 

CCGTTTCTGT GTAAGAGTGT CTAACAAAAC CCGTCCGGTC GAAACTGTAT AGAAAAATCC 5280 

TGCCTTTCTG AAAGTCTACT GACTTTACAA AACAATTATT GCTATCAATG TGGACTATTT 5340 

TTAATCGAAA AGAGCATTCG TTTTCTTCAA ACAGTTCCTC TTCTGTAAAG CTATCAAAAG 5400 

ATTTATAGAA TAACTTACTT GGCCTCCCGT ACTCTTTGGA GCGAGTATAC ATAACACCGA 5460 

ATTTACCCAA ATAGAACGAA CTTTCTACTG AAATATCTTC AATGATAAAT AACTCTTCCA 5520 

TAGTATATTT TTTTATTCCA ATTAAATTAG TCGTACGCAG TGAGGATACA ACCAAAACTA 5580 

TATAACTCTC ATCAGATGAA ATCCTAACAT CCTGTAAGAT ACTATCATCT GGCAAAGTAT 5640 

ATTTTTCCAC ATCAAAGACA ATTTTAAGTG AATTTGAATT GTCTAAACTG GAAGAACTAA 5700 

CCTTAGGAAT CCAGTCATTA TCTTCGACAT ACCATTCCTT TATTACACCA GTATTGGGTA 5760 

TACTCCAATT ATCAAATTGG TACCAATATC GCCCTCTCCT AAATATCAAA GAATTCCATT 5820 

TTTTTAATTC CTGAAATGAT GAAGAGATAG ACCTCTTATA GTGTGTTTTT TCCTGTATTG 5880 

TATTTAAAAA TATTTCATTA CTCTGATTCA CAAGTATGAC CCCTTAATAA TGGTATCTAA 5940 

ATATTATATT TGAGGAAGAA TCGTCAATTT ATTATCCATT ATTGATACCA ATCCAATTGC 6000 

AACACCCGCA AATCCCGAAG CAATATCTGT TGTTATCTTT AAACCATTAT CTCCCGCAAT 6060 

AACAAATCCT TCTTCAATTA CACACAAATA TCTATAAAGT TGTTCAATTA ATTTCTTTTG 6120 

TCCTGAAAAG TTATCATCGA TATCACTATA TATATTATTA GCAACTTCAA GACCACAAAA 6180 

TCCGTTAAAT AAACCTGGTA ATACACAAAA AACTACATCA GTTGCCCTCT CTAAAGAAGT 6240 

TAAATATTTT AAGTATTTGC TTGACAAGAT TTCTTTATTT CTATTAATAA GTAAAAGCAG 6300 

GCCAGCACTT CCAGTTGCTA GATATGGTAG TAATCTATGA CCTTGGCTGT ACTGCAATGA 6360 

ATTATTACTA TCTACTTTAT AAGCAACTAA TTCTTTATCT ACAGCCAATT CTAGACCATT 6420 

TTTATAGATA CTTTCACCAG TTAATTTATA AGCTTCACCG AAGAGCCAAG CTACCCCTGC 6480 

GTGACCATAT AGTAATCCAC CAAAATTCTC ATAAGGATCG TTACTCTGAA CATCACTAGC 6540 

GCCAACTTTA CAAAAAGTTT CTGGATTTTC TATATAATTT AAAGTATATT CTCTAAGCCT 6600 

AATTAGTATT TCTTCTCCTA GTTTATTATC AATTCCCCCT TTACTAAGAA AATACAGTCC 6660 

AACCAGTAAA ATTCCAGCCT GCCCACTATA TAAATTTTTA TTTTGTGAAT TCTCAAATAT 6720 

CTCTATAAAA TGAGTTGTAA AAAGTTCAAC TGCCCGATCT ATCTCCCCAA ATTCATAAAT 6780 

GAGCCAGATT GTACCAATTT TACCATCAAA AAGACCAGAA AGGGACGATT TCTTAAAATT 6840

ATTTACTGCC TCATTAATAA CCTGTGTTCG AATCTCATAA TAGTCATCAA ACTTGAAATT 6900

TTTTACTTTC TTAGCTAGTT GTTGATAACT CCAAAGGATA GCTAAATCTG AAAACGCAAT 6960

TCCTTGATTA AAATTCAGAC CATAATAATG AACTGGGAAG AATCTTGATT GAAATTCTTT 7020

ACGCCACTGT CCATAAGTTA GCGTAAACCC TCTCAATAAT TTTATAATAA AATCTTGTAT 7080

ATCTTGCTCA CTCTCGATAG TTCTAATCTC ATGCATGGGT TTTAAAACTT TTTTCCTGGA 7140

AATATTCTCA ATCTGTGGAC ATTTAGAATC TAGATATGAC AATAAACTTT CTACATAATC 7200

TATATGTTCT CTTGTATAAC CCAAAGACTC AAATAGTTTT TTTCCTTCTA TCCTGGTTTG 7260

ACTTACATAG TTGTATGTCA AATCCGATGT AGTTACTAGT GGCATGTATA AATAATGAGC 7320

TATTTGTCTA ATACCATACC AATCTATCTC ACTGGGAAGT GTTTCTCGCC ATGCTCTAAA 7380

ACCAGGGGCT GCAACTTTAT GTACAACTTT TTCATCATTT GAAAAGACAG CCTGTTCCCA 7440

GTCTATTATA CTAATCTCAT CTTCATCCTT AACCAAGATA TTTCCTAAAT GTAAATCTTG 7500

ATGATATACA TTTTCAGAAT GAAACTTATT CGTTAAATCG ATGAGTTTTT CTACTATCTT 7560

TGAAACTCTC AATAGATAAT CTTTGGTCTT ATCAACAACT TCATATAAAG GAAAATTATT 7620

GGTAACCCAT CTATTTAGTG GAACGCCCTT CATATGTTCA ATTCCTAAGA AGGTGTGCTC 7680

CCAGATCTTA CCGTGCCAGT ATATTTTAGG CGTCTCACTC CATTCATTTA GAATTTTTAG 7740

TGCTTTGCAC TCCGAAGCTA ATTTCTCTGA AGAATAAGTA CCATCAAATC CTAGACCTGT 7800

ATACGGTCTA GCCTCTTTTA AAATTATTTT TTTCCCATCT TCTTTTAGCC TAGCATTATA 7860

TATCCCACCA CTGTTTGAAA ATCTAATTGC ATTATCTATA ATAAAGGGAA AGTCTCCCTG 7920

TTTTTTATCT TTCTTGTCAA GCCATTTATT CAAAAAGTCA GGGGGCACTA TACCTTTTGG 7980

AATTTTAAAT ACTGGTAAAC GTTCATCTTT AACAACTTCA TCGCCAACAA TTAATTCATC 8040

AATAGCAACC TTCTTTTCAT CATCCCTTGA CGGCCTAAAC ACACCATACC TCAGATATAT 8100

TGGTGCTTCA TCCCAACGTT TATCGCTTAA AATATATGGC CCATTATATT GCTTTAAGGC 8160

ACTTTCTAAC CTTTGCAAAA CCGACTCTAA TTCATTTTGA TTTGGATAAC ATGTAATAAA 8220

TTTACCAGAA AATCCTCGAC TAACCAATTT CCCGTTTCGC ATGATAAATT TGTCTTCTGT 8280

ACTAAGATGT TTAAATGGAA TTCGCATTTC ATGGCAAATT TTTGCTACAT CTTGTAACAA 8340

TTCATGTGAA CTGTTATACT CTGAACTAAT GTGTATTTTC CACCCTTGTC TTTCAACAAA 8400

TTTTCCAATA GGGTATTGAT AAACCCACTC ATCATTATTC ATTACTTCGT GCCAATTAAA 8460

AGGCAGACTT ACTTGGTACT TTATGCTAGT ATCTGTACTA TAATCATTAT TAGTGAAAAA 8520

GAAAGGATGC TCCAAATTGA AATTATAATC CATAACAAAA TCTCCAAGAA ATTTTATCAA 8580

ACTTAATATA TCTATAGCTA GACAGACTTA TTTAAATAAA AAGGGAGAAT CCTTTGGATT 8640

-CCATACAG AATCGTAGT ATAGTTmACT CACCGATACA AAGAAATTTC AATAAGTATA


TACCTCCGA AATAAATCTT CATAATCTAA ATCTAATACC TGCACAATCC m 
(2) INFORMATION FOR SEQ ID NO: 44: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8657 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

TCA — — - -AOCACACC OGGCTAAGCA CTATCCAGAG 
— — « — - *™ ACAATACGGT TGAAMCTTA « 

"~ ~ — G ATGCCCTCCT GCOT^ 
~ TGGTCAAACG TCAGATGACC «A AGGATCCCTA 



8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660

TTCAACATTG AGGAGAACTG GAAAGGGCAC CACGAGACTG ACCACACAGA CCTTAACGGC 360

TGGATCTGGG AGCGCAAGTA TGAGGTGGAT TCGCTTTGCT ATCCTTTGCA GTTGGCTTAT 420

CTCCTCTGGA AAGAGACTGG CGAGACTAGT CAGTTTGATG AGATTTTTGT CGCAGCGACT 480

AAGGAAATTC TCCATCTGTG GACGGTGGAA CAAGACCACA AGAACTCTCC TTATCGTTTT 540

GTCCGAGATA CGGACCGTAA GGAAGACACC TTGGTAAATG ATGGCTTTGG ACCTGACTTT 600

GCAGTGACAG GTATGACTTG GTCAGCTTTT CGTCCGAGTG ATGACTGTTG CCAGTATAGT 660

TACTTGATTC CGTCAAATAT GTTTGCTGTA GTAGTCTTGG GTTATGTGCA AGAAATCTTC 720

GCAGCATTAA ACCTAGCTGA TAGCCAGAGT GTTATTGCTG ATGCCAAGCG TCTTCAGGAT 780

GAAATCCAAG AAGGAATCAA AAACTACGCT TACACCACCA ACAGCAAGGG CGAAAAGATT 840

TACGCTTTTG AAGTGGATGG CCTAGGAAAT GCCAGCATCA TGGATGATCC AAATGTACCA 900

AGTCTACTAG CTGCGCCCTA TCTGGGCTAC TGTTCGGTCG ATGATGAAGT GTATCAAGCT 960

ACTCGTCGTA CCATTTTGAG CTCTGAAAAT CCATACTTCT ACCAAGGAGA ATACGCAAGC 1020

GGTCTCGGCA GTTCTCATAC CTTCTATCGC TATATCTGGC CAATCGCCCT TTCTATCCAA 1080

GGCTTGACAA CAAGAGATAA GGCAGAGAAA AAATTCTTGC TGGATCAGCT GGTTGCCTGC 1140

GATGGTGGTA CAGGTGTCAT GCACGAAAGC TTTCATGTAG ATGATCCGAC CCTCTACTCT 1200

CGTGAATGGT TCTCCTGGGC TAACATGATG TTCTGTGAGT TGGTCTTGGA TTACTTGGAT 1260

ATTCGCTAAG GGGCTCGCTT TAGCTCAACC GATTCTTATC AGAATCACAA GTTTACATTT 1320

AAAACGTTAA AATTTAAATT TAGAATGAGG TTTTACTTCA TGGAAAATGT TGTTGTACAT 1380

ATTATCTCAC ATAGTCACTG GGATCGTGAG TGGTACTTGC CTTTTGAAAG CCATCGTATG 1440

CAGTTGGTGG AATTGTTTGA CAATCTCTTT GATCTCTTTG AAAATGACCC TGAGTTCAAG 1500

AGTTTCCACT TGGATGGACA AACTATTGTC CTTGATGACT ACTTACAAAT TCGCCCTGAA 1560

AATCGCGACA AGGTCCAACG CTACATTGAC GAGGGCAAAC TTAAAATTGG TCCCTTTTAC 1620

ATCTTGCAGG ATGACTACTT GATCTCCAGT GAAGCCAATG TCCGCAATAC CTTGATTGGT 1680

CAACAAGAAG CTGCCAAATG GGGTAAATCA ACCCAGATTG GCTACTTTCC AGATACCTTT 1740

GGAAATATGG GACAAGCGCC TCAAATTCTT CAAAAATCAG GCATTCACGT GGCGGCCTTT 1800

GGTCGTGGTG TGAAGCCGAT TGGATTTGAC AACCAAGTCC TTGAAGATGA GCAGTTTACG 1860

TCTCAGTTTT CAGAAATGTA CTGGCAGGGT GTGGATGGTA GTCGTGTTTT AGGTATTCTC 1920

TTTGCCAACT GGTACAGTAA CGGGAATGAA ATTCCAGTTG ACAAAGATGA GGCCTTGACC 1980

TTCTGGAAAC AAAAATTGTC AGATGTGCGT GCCTACGCTT CGACCAACCA ATGGTTGATG 2040

ATGAACGGCT GTGACCACCA GCCTGTACAG AAAAATCTGA GCGAAGCCAT TCGTGTGGCA 2100

AATGAACTCT TCCCGGATGT AATCTTTGTT CATAGTTCTT TTGATGAATA TGTTCAAGCT 2160

GTAGAAGGTG CGCTTCCTGA ACACTTATCA ACTGTTACAG GCGAGTTGAC CAGTCAGGAA 2220 

ACAGATGGCT GGTACACACT TGCCAACACT TCTTCATCCC GCATTTACCT AAAACAAGCC 2280 

TTCCAAGAAA ATAGCAACCT CCTAGAGCAA GTGGTAGAAC CCTTGACTAT TATCACTGGT 2340 

GGACACAACC ACAAGGACCA GTTGACCTAT GCTTGGAAAA CACTTTTGCA GAATGCGCCA 2400 

CATGATAGTA TCTGTGGCTG TAGCGTGGAC GAAGTTCACC GCGAGATGGA AACGCGTTTT 2460

GCCAAGGTCA ACCAAGTAGG AAACTTTGTT AAAAGTAACT TGCTCAACGA GTGGAAGGGT 2520 

AAAATTGCTA CGGATAAGGC TCAAAGTGAC TATCTCTTTA CTGTCATTAA CACAGGCTTG 2580 

CATGATAAGG TCGATACTGT CAGCACAGTG ATTGATGTGG CGACTTGTGA TTTCAAGGAA 2640 

TTGCACCCAA CAGAAGGCTA CAAAAAGATG GCTGCTCTTA TCTTGCCAAG TTACCGTGTG 2700 

GAGGACTTGG ATGGTCGTCC TGTAGAGGCT ACAATCGAAG ACCTCGGAGC TAATTTTGAG 2760

TATAATTTAC CAAAAGACAA GTTCCGCCAA GCTCGTATTG CTCGTCAAGT GCGCGTGACC 2820

ATTCCAGTTC ACCTAGCGCC GCTTTCTTGG ACAACCTTCC AATTGCTGGA AGGAAAACAA 2880 

GAAGACCGTG AGGGTATTTA CCAAAACGGA GTGATTGATA CACCATTCGT AACGGTGAGT 2940

GTGGATGACA ACATCACAGT CTATGACAAG ACAACTCACG AAGCCTATGA AGACTTTATC 3000 

CGCTTTGAAG ACCGTGGGGA CATCGGAAAC GAGTATATCT ATTTCCAACC AAAAGGAACA 3060

GAGCCAATCT TTGCAGAGCT TAAGGGCCAC GAGGTCTTGG AAAACACAGC TTGCTATGCT 3120 

AAAATCTTGC TCAAACATGA ATTGACCGTG CCTGTCAGTG CGGATGAAAA GCTAGAAGAA 3180 

GAGCAACAAG GTATCATCGA GTTTATGAAG CGTGAGGCTG GACGGTCAGA AGAATTGACA 3240 

AACATTCCTC TGGAAACTGA GTTGACTGTC TTCGTTGACA ATCCACAAAT CCGCTTCAAG 3300

ACTCGCTTTA CTAACACTGC CAAGGATCAC CGTATCCGTC TCTTGGTCAA GACTCATAAC 3360 

ACGCGTCCAA GCAATGATTC TGAAAGTATC TATGAGGTGG TGACACGACC AAACAAACCA 3420 

GCTGCTTCAT GGGAAAACCC TGAAAATCCT CAACACCAAC AAGCTTTTGT CAGTCTGTAT 3480 

GACGATGAAA AAGGGGTGAC TGTATCCAAC AAGGGATTGA ATGAATACGA AATCCTTGGG 3540 

GATAACACCA TTGCCGTGAC CATTTTGCGT GCATCAGGTG AGCTAGGTGA CTGGGGCTAC 3600

TTCCCAACGC CAGAAGCACA ATGCTTGCGG GAGTTTGAAG TCGAGTTTGC ACTTGAATGC 3660 

CACCAAGCCC AAGAACGCTT CTCAGCCTAT CGTCGTGCCA AAGCCTTGCA GACACCGTTT 3720 

ACCAGCCTTC AGCTTGCTAG ACAGGAAGGA AGCGTGGTTG CGACTGGTAG CCTCTTGAGC 3780 

CATTCTGTTC TCAGCATACC GCAAGTTTGT CCAACAGCCT TTAAGGTAGC TGAAAATGAA 3840
GAAGGCTATG TGCTTCGTTA CTACAATATG TGTAGTGAAA ATGTACGTGT GCCAGAAAGT 3900

CAACATCTCT TCCTTGACCT ACTTGAACGA CCATACCCAG TTCATTCAGG ACTATTGGCT 3960 

CCACAAGAGA TTCGTACAGA ATTCATCAAA AAAGAAGAAA TTTAATTTCA AAAAGTAAAC 4020 

ATCAAAAGAA AGGAGGGGCG AAAAAGTAAG AACTAACTGC TGATTCGCCC CTTTTATGGT 4080 

AAAAACAATG ACCATTGCAA CGATTGATAT CGGAGGGACT GGGATTAAGT TTGCCAGTCT 4140 

GACTCCTGAT GGGAAAATAC TGGATAAGAC AAGTATTTCA ACGCCTGAAA ACTTGGAGGA 4200 

TTTACTAGCG TGGCTAGATC AACGCTTGTC AGAACAGGAT TACAGTGGGA TTGCTATGAG 4260 

CGTTCCAGGT GCAGTCAATC AAGAGACAGG TGTGATTGAT GGCTTCAGTG CGGTGCCCTA 4320 

CATCCATGGC TTTTCTTGGT ATGAGGCGCT TAGCTCTTAT CAGCTACCTG TCCATTTAGA 4380 

AAATGATGCC AACTGCGTTG GACTCAGTGA ACTACTAGCT CATCCAGAGC TTGAAAATGC 4440

AGCCTGTGTC GTGATTGGGA CAGGGATTGG CGGAGCCATG ATTATCAATG GTAGACTTCA 4500 

TCGAGGTCGC CACGGTCTGG GTGGAGAATT TGGCTACATG ACAACCCTTG CCCCTGCTGA 4560 

AAAACTTAAT AACTGGTCGC AACTAGCATC AACTGGGAAT ATGGTACGAT ACGTGATTGA 4620 

AAAATCTGGT CATACTGATT GGGACGGTCG CAAGATTTAC CAAGAGGCCG CAGCTGGTAA 4680 

TATCCTTTGT CAAGAAGCCA TTGAGCGCAT GAACCGCAAT CTGGCGCAAG GCTTGCTCAA 4740 

TATCCAGTAT CTGATCGATC CAGGTGTCAT CAGTCTGGGT GGCTCTATCA GTCAAAATCC 4800 

AGATTTTATC CAAGGTGTCA AGAAGGCTGT TGAAGACTTT GTCGATGCCT ACGAAGAATA 4860 

CACGGTCGCA CCAGTTATCC AGGCCTGCAC CTATCACGCA GATGCCAATC TCTACGGTGC 4920 

TCTTGTCAAC TGGCTACAGG AGGAAAAGCA ATGGTAAGAT TTACAGGACT TAGTCTCAAA 4980

CAAACGCAAG CTATTGAGGT TTTAAAAGGT CACATTTCTC TACCAGATGT GGAAGTGGCT 5040 

GTCACTCAGT CTGACCAAGC ATCTATCTCT ATCGAGGGTG AGGAAGGTCA CTATCAATTG 5100 

ACCTACCGCA AACCTCACCA ACTTTATCGT GCCTTGTCCT TGTTGGTAAC AGTTCTAGCA 5160 

GAAGCTGATA AAGTAGAGAT TGAGGAACAA GCAGCTTACG AAGATTTGGC TTACATGGTT 5220 

GACTGTTCTC GAAATGCGGT GCTGAATGTG GCTTCTGCCA AGCAGATGAT TGAGATATTG 5280 

GCTCTCATGG GCTACTCAAC CTTTGAGCTT TACATGGAAG ACACTTACCA GATTGAAGGG 5340 

CAGCCTTACT TTGGCTATTT CCGTGGAGCT TATTCAGCAG AGGAGTTGCA GGAAATCGAA 5400 

GCCTATGCCC AACAGTTTGA CGTGACCTTT GTACCATGCA TCCAGACCTT GGCCCACTTG 5460 

TCGGCCTTTG TCAAATGGGG TGTCAAGGAA GTGCAGGAGC TCCGTGATGT AGAGGACATT 5520 

CTTCTCATTG GCGAAGAAAA GGTTTATGAC TTGATTGATG GCATGTTTGC CACGTTGTCT 5580 

AAACTGAAGA CTCGCAAGGT CAATATCGGG ATGGACGAAG CCCACTTGGT TGGTTTGGGA 5640

CGCTACCTGA TTCTGAACGG TGTTGTGGAT CGTAGTCTCC TCATGTGCCA ACACTTGGAG 5700

CGCGTGCTGG ATATTGCTGA CAAATATGGT TTCCACTGCC AGATGTGGAG TGATATGTTC 5760

TTCAAACTCA TGTCAGCGGA TGGCCAGTAC GACCGTGATG TGGAAATTCC AGAGGAAACT 5820

CGTGTCTACC TAGACCGTCT CAAAGACCGT GTGACTCTGG TTTACTGGGA TTATTATCAG 5880

GATAGCGAGG AAAAATACAA CCGTAATTTC CGCAATCATC ACAAGATTAG CCATGACCTT 5940

GCATTTGCAG GGGGAGCTTG GAAGTGGATT GGCTTTACAC CTCACAACCA TTTTAGCCGT 6000

CTAGTGGCTA TCGAGGCTAA TAAAGCCTGC CGTGCCAATC AGATTAAAGA AGTCATCGTA 6060

ACGGGTTGGG GAGACAATGG TGGTGAAACT GCCCAGTTCT CTATCCTACC AAGCTTGCAA 6120

ATCTGGGCAG AACTCAGCTA TCGCAATGAC CTAGATGGTT TGTCTGCGCA CTTCAAGACC 6180

AATACTGGTC TAACGGTTGA GGATTTTATG CAGATTGACC TTGCCAACCT CTTACCAGAC 6240

CTACCAGGCA ATCTCAGCGG TATCAATCCC AACCGCTATG TTTTTTATCA GGATATTCTT 6300

TGTCCGATTC TTGATCAACA CATGACACCT GAACAGGACA AACCGCACTT CGCTCAGGCT 6360

GCTGAGACGC TTGCTAACAT TAAAGAAAAA GCTGGAAACT ATGCCTATCT CTTTGAAACT 6420

CAGGCCCAGT TGAATGCTAT TTTAAGTAGC AAAGTAGATG TGGGACGACG CATTCGTCAG 6480

GCCTACCAAG CGGATGATAA AGAAAGTTTA CAACAAATCG CCAGACAAGA ATTACCAGAA 6540

CTTAGAAGCC AAATTGAAGA CTTCCATGCC CTCTTTAGCC ACCAATGGCT GAAAGAAAAC 6600

AAGGTCTTTG GTTTGGATAC AGTTGACATC CGTATGGGCG GACTCTTGCA ACGCATCAAA 6660

CGAGCAGAAA GCCGTATCGA GGTTTATCTG GCTGGTCAGC TTGACCGCAT CGACGAGCTG 6720

GAAGTTGAAA TCCTACCATT TACTGACTTC TACGCAGACA AGGATTTCGC AGCAACTACA 6780

GCCAACCAGT GGCATACCAT TGCGACAGCG TCGACGATTT ATACGACTTA ATATTCTTCG 6840

AAAATCTCTT CAAACCACGT CAGCTTCCAT CTGCAACCTC AAAACAGTGT TTTGAGCAAC 6900

CTGCAGCTAG CTTCCTAGTT TGCTCTTTGA TTTTCATTGA GTATAAAAAC AAGAACACCT 6960

TGCTTGGCGC AGGGTGTTTC GCGTGAAACA GAAGAATTAT CTGGTTTCAA ATGCTACAGT 7020

TAGACAAACT TATGATAAAA TAGCAGAAAG TGAATGTTTC CTAAGAGCAA TTGGAGGTAT 7080

TATGCTACAC TTAAAATTAG TAAAACAAGA AATAGAAGCT GAAAAGCCAG CATCTGTAGA 7140

AGCTTGGATC ATTTCCGTCA AATTTAAAAA AGGTTGCTAC CGACATATAT AGATTCCAAA 7200

AACAAAAACG TTAGCGGAAC TAGCAGATGT GATTTTATGG AGTTTTGATT TTGCAAATGA 7260

TCATGCTCAC GCATTTTTCA TGGATAATGT TGAGTGGAGT CATGCAGATT CTTACTTTCG 7320

TAGCTTTGTT AGTGACGATG TTGAAGAACG TTACACAGAA AATGTCTATC TGGATAGCCT 7380




AAGTGTCAAA CAAAAATTTA AGTTTATTTT CGACTTCGGT GATGAATGGC GTTTTGAATG 7440 

CCAAGTGCTG AGAGAAATCG AGACAGAGGA CGAAGAAGCT TATCTCGTAC GTTCGGTTGG 7500 

AACGTCGCCA GAACAATATC CAGATTATGA TGGTTTTGAC TATGAAGAAT GGTAAAATTG 7560 

AAATCAGTCT GTGTAGGCTT AGTATTTCAA TAGACTTCCT GCAAAACTAG AATCCTAGTT 7620 

CATGATTGAT AATACCAGCA ATCAAATTCA TTCGTAATCC GAAGCGTTTA CGATGATTTC 7680 

GATAGGTTGT TGAAAACATT TTAAACGTTT TTACTTTGGC AAAGATGTTC TCAACCTTGC 7740 

TTCTCTCCTT AGATAGCGCA TGGTTATAGG CTTTATCTTC AGCTGTTAGT GGCTTGAGTT 7800 

TGCTGGATTT ACGTGAAGTT TGTGCTTGAG GACATATCTT CATGAGCCCT TGATAACCAC 7860 

TGTCAGCCAA GATTTTACCA GCTTGTCCGA TATTTCTGCA ACTCATTTTG AACAACTTCA 7920 

TATCATGACA ATAGTTCACA GTGATATCCA AAGAAACAAT TCTCCCTTGA CTTGTGACAA 7980 

TCGCTTGAGC CTTCATAGCG TGAAATTTCT TTTTACCAGA ATCATTCGCT AATTCTTTTT 8040 

TTAGGGCGAT TGATTTTTAC TTCCGTCGCA TCAATCATTA CCGTGTCCTC AGAACTAAGA 8100 

GGAGTTCTTG AAATCGTAAC ACCACTTTGA ACAAGAGTTA CTTCAACCCA TTGGCTCCGA 8160 

CGGATTAAGT TGCTTTCGTG AATACCAAAA TCAGCCGCAA TTTCTTCATA AGTGCGGTAT 8220 

TCTAGGCTTA ATTTAGGTTT TCGTCCACCT TTTGCGTGTT TAAGTTGATA AGCTGTTTTT 8280 

AATACAGCTA ACATCTCTTT AAAAGTCGTG CGCTGAACAC CAACAAGACG CTTAAATCGT 8340 

GTATCAGTTA ATTGTTTACT TGCTTCATAA TTTCGCAGGG AGTCTATTGA CTCTTTGGTA 8400 

GGTGTCAATG TTTTTTTCAT CTATCCCGAG AATTATTTTC CCGCCATTTG TATTTGCAAA 8460 

TGCTGAGTAG GTTTCCCAGA AAGACTCTGG AAGATTGTTT TTAGCTTTTT TGTATTCTAA 8520 

ATCAACCCCT TCAAATTTTA AGTCCATATT TTTCCTTTAC ATCTGTTTTT TGTGGTTCTG 8580 

GTATTTGTTC AAGTTGAGTG ATAATATAGC GAATTGAATT TCGAGAGTTT TTACTCAGTT 8640 

AATTTCTTTT TTAACCC 8657 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11384 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

TCTATTTTGG GTATAGACTT ACCTATAAAG AAAAATATCT ATACACTGCC TTACTAGCTA 60 

TACTGAACGA GTCAACAAAA ACGATATATA TTGATGATAT AAATACAGCA AGATTTTTTA 120

ACTTCTTTGG CAATGATATT CCTAATTCGT CTTTAAAAAA AATTGACTAT ATCGCACCTT 180

CAGAAATTGT TTCATTTAGT ACGTACGTTC GACAACGTTC TAAAGTAATT CCTAAAATTT 240

TGGAACATAT ATTAAAATCA AGTTTTTTAT TAGAGAATAT AGATGTTTCT GGTTACACTG 300

TAAATATTTT AGAAGATCAA TTAACAAAAC ATAGAACAAT CAAAATTAGT AAAAACTAAC 360

TGGTTGATCT CATGTATAAA TACCTAACAA AACCACGCGC CTTGCCTGCT GATGGAAAGA 420

AAGGTACAAA TACATGAATA TCAAAGAAAA AATCAAAAAG AATGGCCAAA GAGTTTATTA 480

TGCTAGTGTT TATCTAGGCG TTGACCAACT AACGGGCAAA AAAGCCCGTA CAACTGTTAC 540

AGCAACCACT AAAAAGGGCG TTAAAGTAAA AGCGCGTGAT GCGATCAATA CTTTTGCTGC 600

TAATGGCTAT ACAGTTAAAG ACAAGCCGAC AATTACAACA TATAATGAGC TTGTAAAAGT 660

TTGGTGGGAT AGTTACAAGA ATACAGTTAA GCCAAATACT CGCCAATCCA TGGAGGGATT 720

GGTTAGAGTG CATTTATTGC CTGTATTTGG CGATTACAAG CTATCTAAAC TTACTACGCC 780

TATTCTTCAA CAGCAAGTAA ACAAATGGGC TGACAAGGCA AATAAAGGCG AAAAAGGGGC 840

ATTTGCTAAC TACTCTTTGC TCCATAACAT GAATAAGCGT ATTTTGAAAT ATGGCGTAGC 900

TATCCAGGTA ATACAATACA ACCCAGCTAA TGATGTCATC GTTCCACGCA AACAGCAAAA 960

AGAAAAGGCT GCTGTCAAAT ACTTAGACAA CAAAGAATTA AAACAGTTTC TTGATTATTT 1020

AGATGCTCTG GATCAATCAA ATTATGAGAA CTTATTTGAT GTTGTTCTGT ATAAGACTTT 1080

ATTGGCCACT GGTTGCCGTA TTAGTGAGGC TCTGGCTCTT GAATGGTCTG ATATTGACCT 1140

AGAAAGCGGT GTTATCAGCA TCAATAAGAC ACTAAACCGC TATCAGGAAA TAAACTCACC 1200

TAAATCAAGC GCTGGTTATC GTGATATACC AATAGACAAA GCCACATTAC TTTTACTGAA 1260

ACAATACAAA AACCGTCAAC AAATTCAGTC TTGGAAATTA GGCCGATCTG AAACAGTTGT 1320

ATTCTCTGTA TTTACGGAGA AATATGCTTA TGCTTGTAAC TTACGCAAAC GCCTAAATAA 1380

GCATTTTGAT GCTGCTGGAG TAACTAACGT ATCATTTCAT GGTTTCCGCC ATACACATAC 1440

TACTATGATG CTCTATGCTC AGGTTAGCCC GAAAGATGTT CAGTATAGAT TAGGCCACTC 1500

TAATTTAATG ATCACTGAAA ATACTTACTG GCATACTAAC CAAGAGAATG CAAAAAAAGC 1560

CGTCTCAAAT TATGAAACAG CTATCAACAA TTTATAAAAA ATAAGGGTGA CCCATTTCCG 1620

GGCTACCCTC TTACTATACC AAAAATTAGT AGGGGTAGTA AAAAGGGTAT TAAATTATAA 1680

AAAGCACTAA GGGAAAGCGC CCCAAAGTGC TTATTTCAAA GGCTTTATAG CCTATAATCA 1740

CATAAAGAGA TTATTTTTTA AGGTTGTAGA ATGATTTCAA TCCACGATAT TCAGCTACTT 1800

CACCAAGTTG GTCTTCGATA CGAAGCAATT GGTTGTATTT AGCGATGCGG TCTGTACGTG 1860

AAAGTGAACC AGTCTTGATT TGTCCTGCGT TAGTTGCAAC TGCAATATCA GCGATTGTTG 1920

AATCTTCAGT TTCACCTGAA CGGTGTGATA CAACAGCAGT GTAACCAGCT TCTTTAGCCA 1980 

TTTCGATAGC TTCAAAAGTT TCAGTAAGAG TACCGATTTG GTTAACTTTG ATAAGGATTG 2040 

AGTTAGCAGC ACCTTCTTGG ATACCACGTG CAAGGTAGTC AGTGTTTGTT ACGAAGAAGT 2100 

CGTCACCAAC AAGTTGTACT TTCTTACCAA GACGTTCAGT AAGAGCTTTC CAACCATCCC 2160 

AGTCGTTTTC ATCCATACCA TCTTCAATAG TGATGATTGG GTATTTGTTA ACCAATTCTT 2220 

CAAGGTAGTC GATTTGTTCT GCAGATGTAC GAACAGCAGC ACCTTCACCT TCAAATTTAG 2280 

TGTAGTCGTA AACTTTACGT TCTTTATCGT AGAATTCTGA TGAAGCACAG TCAAATCCGA 2340 

TAAATACGTC TTTACCTGGT ACATATCCAG CAGCTTCAAT CGCAGCAAGG ATAGTTTCAA 2400 

CACCATCTTC AGTTCCTTCG AAACGAGGAG CGAATCCACC TTCGTCACCT ACGGCAGTTT 2460 

CCAAACCACG TGATTTAAGG ATTTTCTTAA GAGCGTGGAA GATTTCAGCA CCGTAACGAA 2520 

GGGCTTCTTT AAATGTTGGC GCACCAACTG GCAAGATCAT GAACTCTTGG AAAGCGATTG 2580 

GAGCGTCAGA GTGAGAACCA CCGTTGATGA TGTTCATCAT TGGAGTTGGA AGAACTTTAG 2640 

TGTTGAATCC ACCAAGATAG CTGTAAAGTG GGATTTCAAG GTAGTCAGCA GCAGCACGAG 2700 

CTACAGCGAT AGACACACCG AGGATTGCAT TCGCACCCAA TTTACCTTTG TTAGGAGTAC 2760 

CGTCAAGTGC GATCATAGCA CGGTCAATAG CTTGTTGATC ACGTACATCG TAGCCAATGA 2820 

TAGCTTCAGC AATGATGTTG TTTACGTTGT CAACAGCTTT TTGTGTACCA AGACCACCGT 2880 

AACGAGATTT GTCACCGTCG CGAAGTTCAA CTGCTTCGTG TTCACCAGTA GAAGCTCCTG 2940 

ATGGAACCAT ACCACGTCCG AAAGCACCTG ATTCAGTGTA AACTTCTACT TCAAGTGTTG 3000 

GGTTACCGCG TGAGTCTAGG ACTTCGCGAG CGTAAACATC AGTAATAATT GACATTTTTT 3060 

ACTCTCCTTA TGAGTTAAAT TTTTTACACC TCTATAATAC CTTAAAACCC CTCCTTTTTC 3120 

AAGAAAAAAC GTTATCTTTG TGCAACTTTT CCTTAACTTT ATAAAGTAAT CGCTTTCTTT 3180 

TGTCTGTTTT ATTCTAACTT TTATGATATA CTGTTTTCAT GACAGATTTA TCAAAACAAT 3240 

TACTTGAAAA AGCTCATGGT GGGTTAAAAA TAAATCCGGA TGAGCAAAGA CGCTATCTTG 3300 

GTACTTTTGA GGAAAGAGTT CTTGGATATG TAGATATTGA CACAGCAAAT AGCCCTCAGT 3360 

TAGAAAAAGG CTTTTTATTT ATTTTAGAAA ACCTTCAGGA AAAAGCAGAG CCACTATTTG 3420 

TGAAGATTTC ACCAACTATC GAATTTGATA AGCAAGTTTT CTACTTAAAA GAAGCAAAAG 3480 

AAACTGATAG TCAAGCCACC ATAGTATCTG AAGAGCATAT TACTTCTCCT TTTGGCCTGG 3540 

TTATTCATAG CAATGCACCA GTTCAAGTAG AAGAAAAAGA CCTTCGACTT GCTTTTCCAA 3600

AACTTTGGGA AGTTAAAAAG GAAGAACCAG CCAAAACATC CTTATGGAAG AAATGGTTTA 3660
GCTAAATCTT GCACATATTT AATAAC^CC CM TATOGC AGCCG^CGC TC CAGA T AGA

^ CTA ^ CTKAcmv cfcQuuuwA gaaaagacag 

CTTGGATATT TTCAAATGGT AGGGGAGGTA AA^^ACc 

TAACAGGAAC TCCAACAGGG GTTCTTTTTG CAACACCTAT AGGCGC^C CCAGCAAAGC 
TTTGACTATC AAG^CCT TCTCCAACAA CAACCAAGTC AGCATCTGAA ACTTTCTTAT 
CAAAGTTGAT TAAGTCCAAG CAGGTATCAA TTCCAGACAC GATAC^CC TGAGCAAAGG 
CACACAAACC ACCAGCAAGG CC^CACCTG CW^ „ Mam 
GTGAGAATTT TTCATAAAAA rcnaa^ CCTGATCTAC GACTGCAAAC ATAGTCGGAT 
GTAGACCTTT AAAGTGTAAG TCGCACCTTG ATGACCACAT AAGGGACTCA 

CGACA TCTCC TAAAATATGA ATTTGAACAC CTTCAGGAAT TTTATAGCAA ^c^O 
AAACAGAAGC TAAGTTTAAT AAGGATTGAC CGGAAGCAGG CAAGACATTT CCATCCCTAT 
CATAAAATTG ATAACCTAAA CCAGCAGCAA TCCCCAGTCC TCCATCATTA C^CCC^c 
CACCAACAOC GATATAAATA TCTTTAATCC CTTTAGAGAT GAGATGAAGA ATCAACTCTC 
CAATACCACA AGTTTGGATT TGAAGTGGAT TAGCGGAATT TTTCCAAGAC 

CAACCAAGTC AGCTACTTCA AATAGTGCCA GTTCCCCTTT TTGAAAATAG CGCATGGCTT 

cm™** aaaaggc TCT gtcacttgga TC ca CTTTTC ^ AGG7ca agagaatgtc 

GGATAGCATC TACAGTACCT TCT CCCCCA T CACCAACAGG GCAGAGGAGA CATTCTACAT 
CTGCTATCGA TTGTTGGAAG CCTC^, 

™C TTA AA CGAATCCGGT GCAATTACAA TCTTCATATT „CCC T CA TT CTAAACAGTC 
AATCAAAGGG AGAACTTCTA AAAAATCCCT CT^CAACA TTTCTTTTTT 
GAGCACTTCT TTGGCACAAA AGGCGATTCC TAAC^CGCC GACTTCAACA TTAATAGATT 
ATTAACCCCA TCACCGATTG CCACCC^ TTCTTTAGAA AGTTTTAGTT TCTTTCTCCA 
TTTTTCCAGA GTCTCTTTTT TGACCTGGGG ACTTATAATT TGTCCAACTA ATTTrcCTGT 
TAAAAGACCT «m«C« CAAGCTAGTT GGCAGTGAAA TAGGCAATAC CAAGGGATTT 
TGCTAATCTC TCCAACTATT GGTGTAAATC CACCAGACAC CAGACCAACT AGGATGCCAT 
— CGAG AATAGAGATG AACTCTGGGA CATTTAGCGA TAGATGAATT GAGTTCAAGA 
CGTTATCAAA GACCAAAATA GGAAGACCTT CCAACAAGGA CACTC™ CTTAAACTGC 
TTTCAAAGAC CAAO^TCCT CGCA^CTC GACTTGTAAT CTGCGAAATT ^CGCCTCA* 
TCTCCCTAAA AGATCAATCA C^PeXAG GATTAAGGTT CCA^ACA, 



3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800
4920
5100

5160

CCAAAACACA CAAGCCTTTT ACTTGAGACA TCAGTTCTCC TCTCTAAACA GCCTAAAAAT 5460

CGTATGAAGT CATCATACGA TTTTATCTAT TAATTAACTA AACTATGGTA CAAGTCAAGG 5520

TATGACTTGC AGGCTGTATC CCATGAGAAG TCACTCTCCA TAGCTTGTTT TTGTAGGTTT 5580

CTCCAAATGT CTGGATGGTT TCTATACAAG TCCAATGCTG TTTGGAAAGT CCAATTTAAC 5640

CAATAAGGAG ATAGATTGTC AAAGCTAAAG CCAGTACCGC TTCCTTCGAT TGGATTGAAA 5700

GCGCGAACTG TATCTCGCAA GCCTCCAACT TCATGGACCA ATGGCAAGGT TCCATAACGC 5760

ATAGCCATCA TTTGAGACAA GCCACACGGT TCAAAACGAC TTGGCATGAG GAAGAGGTCA 5820

CAAGCAGCGT AGATTTCCTG AGCAAGTTTG ACATCAAAAG TGATATTTGT TGATAGCTTG 5880

TCTGGGTAAA TCTGAGCAAA CCATGAGAAA GCTCCTTCAA AGGCTGGATC GCCAGTTCCC 5940

AAAAGAACAA TCTGAACATC TTCTTGCAAG ATATGGTGAA GACTTTCGAC CACCACATCA 6000

AAACCTTTTT GACGTGTCAA ACGAGAAACA ATTCCCACCA GTGGAACGTC TGCTCTAACA 6060

GGCAAGCCAA CTCTTTCTTG CAATTTTGCC TTATTTTTGG CTTTCCCAGA CAAATCTTCC 6120

TGATTGAAAT GATAGTCTAA AAGAGCATCC GTCTGAGGAT TATAAAGATC AGCATCAATC 6180

CCATTCACGA TACCAGATAC TTTACCAGAC TCCATTTTAA GAATCTGATC CAAATTACAT 6240

CCAAACTGAC TAGTCATAAT TTCATGAGCA TAGCTAGGTG AAACGGTTGA AACACGGTTC 6300

GCATAGAGAA TACCTGCCTT CATCCAGTTC AGACAGTTGT TCCATCGAAG GGTGCCATCA 6360

GCGTAACGTT CAAAGCCAAC TCCAAACAAA TCACCCAACA TTCCTTCTGA AAATTGTCCT 6420

TGGAATTCTA AATTATGAAT GGTTAAAACT GTTTCAATGT CCTCATAGGC TTGAATCCAA 6480

CGGTATTTTT CCTTCAACAA GAAAGGAATC ATAGCTGTAT GGTAGTCATG AACATGGAGA 6540

AGATCAGGAA TAAAGTCAAT CCTTTCCATA GCCTCAATGG CAGCCAGTTG GAAAAAGGCA 6600

AAGCGTTCTC CGTCATCAAA ATCACCGTAA ACATGACCAC GGAAGAAATA ATATTGATTG 6660

TCAATAAAGT AGAAGGTTAC ACCATTTAAT ACTGTTTTCT TAATTCCACA ATACTGTCTG 6720

CGCCAACCAA CGCTCACCTC AAAATGAAGC ACATCTTCAA TCTGATTTCC AAATTTAGCC 6780

TCTACCATAT CATAGTAGGG TAAAATCACT GCAACTTCGT GCCCAGCTTT TACCAGTGAT 6840

TTTGGAAGAG CGCCAATGAC GTCTCCCAAA CCACCTGTTT TTGAAAAGGG TGCACCCTCT 6900

GCTGCTACAA ATAAAATTTT CATGAATGAA TATCCTCTGT TACTTTAGCA CCTTTCTTAA 6960

CCACAACTGG ATGTTCTGCA GTTCCTCGAA TCACAACACC ATGCTCAACT TCAACCCCTT 7020

TGTCCAAGAT AGCATATTCG ACCTGAGCCC CTTCTCCAAT AACAACACGA GGGAAGAGCA 7080

GGCTATCTTT AACCAAGCTA TCCTTATGGA CATGAATATT ACGTGATAGA ACAGAATTAG 7140

CTACTTGACC TTCAATAATA CTACCAGAGG CAAACTGAGA AGTGCTTACC TTAGATGTAT 7200

TAGCATAGTA AGTTGGCTCT TCGTTTTTGA CCTTTGTATA AATCTTTTGG TTTGGTGAGA 7260

AAAGAGAATA GAATTTTTGT GATTCAAGCA TATCGATATT CGCTTGATAA TAAGATTTAA 7320

CAGAGTGAAT ATTGGCTAGA TAGCCCGTGT ACTCGTAGGC GAAAGCTCCC TCTTTTACAG 7380

CCAAATCCCG TAAAACATAG CGCAATTTCT CTGGATGTTC TTTTTTAGCT TCTTCTTCCA 7440

AGTGTTCAAT CAACCAAGGT GTATCAACGA CAAAGATATC TGTAGACATA TTGAACGTTT 7500

CAGCTGTTGA CTTGCTATCA AAGAGTTTAT GAGAAAGAAC ATGGTCTGTT TCATCTACAT 7560

CCAAGATTGC ATTTACTTCT GAAATATCTT TCTTAGCTAG TTTTTTATAA ACTACAGTGA 7620

TAGGCTCTTT TGTTGTACTA TGTAGGTGGA AAACTTGGTT CAAATCAATG TTAATAAGAA 7680

CATCGCAGTT GAGGGCAACC GTTTGGTTTG AGCCAGAACG TTTCAAATAA GTAAGAAGCT 7740

GTTGGTAGTA TTCTTTTCCA ACTGTACTAC TTTCTACACG GGTATTGTAA ATTCCTAGAT 7800

AGTAATGGCT AAGAAGGGTT GATAAGCCCC ACTCGCGTCC TGAACGAATA TGGTCAAATA 7860

CTGAGCTGAT ATTATCCTGC TGGAAAATAC CAAAGACACT ACGAACACCT GCATTAGCAA 7920

GGCTTGAAAG TGGGAAGTCA ATCAAACGAT ATTTCCCACC AAATGGCAAA CTTGCTACTG 7980

GACGGTGGTC CGTCAATGTC GACATATTGT GAAAACCAAC TGTATTTCCT AAAATGGCAG 8040

AATATTTATC AATCTTCATC TGTTGCTACC CCCACTACTT CATTATATCC TACAACTTGT 8100

ACTTCATCTG TTCCATCAAT TTCGACACCG TCAGAAATAA TCGCACCTTC ACCAATAATG 8160

GCACGTTTAA TCTTAGCTCC TTGACCAATG ATAGCTCCAC TCATGATAAC TGAATCAAGG 8220

ACTTCCGCTC CTTCGCGAAC TTGCGCGCCT GTTGAAAGGA TAGAATGTTT AACAGTTCCA 8280

TCAACGAAAC ATCCGTCTAC AACTAATGAG TCTTCCACAT GAGCATTTGC CCCGAGGAAG 8340

TTTGGTGGTG AAATCAAGTT TCTTGAGTAA ATCTTCCATT GACGGTTACG ACTATCCAAG 8400

GCATTTTCTG GAGAAATATA CTCCATGTTC GCTTCCCAAA GTGACTCAAT AGTACCAACA 8460

TCTTTCCAAT AACCACTAAA TTCGTAAGCA TAAACACTTT CACCTGACTC AAGGTAATTT 8520

GGAATGACAT TTTTACCAAA GTCTGACATG CCAACCTTGC TCTTTTCAGC AGCGACTAAC 8580

ATATTACGAA GGCGTTGCCA ATCAAAAATG TAGATTCCCA TAGAAGCTTT TGTAGATTTA 8640

GGTTGAGCTG GTTTTTCTTC AAATTCAACA ATACGATTGT TAGCATCTGT GTTCATGATA 8700

CCAAAACGGC TTGCTTCTTT AAGAGGGACG TCTAAAACTG CTACTGTCAA GCTGGCATTA 8760

TTATCCTTAT GAGACTGGAG CATATCATCA TAGTCCATTT TGTAGATGTG ATCCCCAGAC 8820

AAAATCAAGA CATACTCAGG ATTGACACTG TCGATATAGT CGATATTTTG GTAAATAGCG 8880

TGACTAGTCC CCTCAAACCA ACGATTTCCT TCACTTGCAG AATAAGGTTG AAGAATAGAG 8940

ACACCTGAAT TAATACCGTC TAGTCCCCAG CTTGAACCAT TCCCAATATG GTTGTTGAGA 9000

GCAAGTGGTT GATACTGTGT AACGACCCCA ACATTGTGAA TCCCTGAGTT GGCACAGTTT 9060

GATAGGGCAA AGTCAATGAT ACGGTAGCGC CCACCAAATT GCACAGCTGG TTTTGCGATG 9120

CTTTGAGTGA GTTTACCGAG ACGAGTTCCT TGCCCACCAG CAAGAATCAA AGCTAACATT 9180

TCATTTTTCA TTTTCTACTC CTTTTTGGTT TTTATTTGTG ACGGTTTTAG TAGATTTCAA 9240

GCGACGTTTG ATTTTCCATA CACTTGCTCC CATAGCCGGT AGGGTAAAGG TTAAGGTCTG 9300

CTCATAATCT TTCCATAGTC CTTCTTGCGT TTGAACAGTT TGATTATGTT CTTTCCAAAC 9360

GCCTCCCCAC TCTTCCAACT CAGTATTCCA TACTTCTTCG TAAATTCCTG CAACGGGTAG 9420

TCCGATTGTA AAATCTTTCC GCTCAACAGG TACCATATTA AAGATACAGA CTAACATTTC 9480

TCCCTTTTTA CCCTTACGAA TAAAGGAAAG AACACTCTGG TCTCGATTAT CCGCATCAAT 9540

GATTTCAATA CCATCATAGC TGGTATCAAT TTCCCACAGA CAGCGATGAT CTTTGTAAAA 9600

CTGGTTTAGC TGAGAAGCGA AATACTTCAT CTTAGCATTC ATTGGGTCTT CTAGGTTAGA 9660

CCATTCCAAC TGTTCTTCAG ATTTCCATTC TAGGAATTGA CCGTATTCGC TACCCATGAA 9720

GAGCAATTTC TTACCAGGGT GACAAATTTG GTACGTATAG AGATTGCGCA AGCCTGCGAA 9780

TTGATTGTAA CGATCTCCCC ACATCTTATG CATCATACTC TTCTTGCCAT GAACCACTTC 9840

ATCGTGCGAG AATGGCAAGA GATAATTCTC CTTGAAAACA TACATAAAGC TGAAAGTCAC 9900

CAGGTTAAAG TCATATTTAC GATAGATCGG ATCTTCTTCG TAGAAACGGA GGATATCATT 9960

CATCCAGCCC ATGTTCCATT TGTAGTCAAA TCCTAGACCA CCAATCTCTT TCATTCCCGT 10020

AATCTTGATC GCAGACGAAC TTTCTTCTGC AATCATCATC ACATCTGGAT ATTCTAACTT 10080

AATAACCTCA TTCAAGCGCT GAAGGAAATA ATAACCTTCA TAGTTGAGAT TTCCGCCATC 10140

TTTATTAGGT GTCCATGGAG CATCATCATA GTCCAAATAG AGCATGTTGC TAACAGCATC 10200

CACACGAATA CCATCCAAAT GATAGACATC AATCCAATGC TTAATGCAAG AAATTAAGAA 10260

GGACTGGACT TCATTTTTTC CAAGGTCAAA ATTAAGGGCA CCCCAACCAT GGTTATGAGC 10320

CTTATTATGG TCTTGGTATT CAAAAGTCGG TGTCCCATCA TAATAGGCTA AGGCATCATC 10380

GTTGATGGTA AAGTGACTGG TACCCAGTCC ACAATAACCC CAATATTATG GGTATGACAC 10440

TCCTCGACAA AATCTTGAAA CTCCTCTGGT CGGCCATAAG CATGCTCTAA AGCGAAGTAA 10500

CCCATAAGCT GATACCCCCA ACTCAAGCCC AAAGGATGGG ACATCAAGGG CATAAACTCA 10560

ATATGAGTAT AGTTCATTTC AACGAGATAA GGAATGAGTT CATCCTTGAG CTGGGCAAAA 10620

CTATAAGGAC TGCCATCAGA ATTTCTTTTC CATGATCCAG CGTGAACTTC ATAAATATTG 10680

ACAGGACGCT CTTCAAAGCC CCAACGTTTT CTTCGTGCCA GCCAAAGTCC ATCCTTCCAT 10740

TTCTTCTCAG GAAGCTCTGT TACGATTGCC CCTGTTCCTG GACGAGCCTC ATACCTGACA 10800

GCAAAAGGGT CAATCTTCAT CAGTTGATGA CCATTTTGAC GTGTGACATG ATATTTGTAA 10860

ATATGCCCTT CTTGAGCCAT ATTGGTAAAG ACTTCCCAGA CCCCAAAATC ATTTCTTACC 10920

ATTGGAATCT GATTTTCAAT CCAGTTGGTA AAATCACCAA CCAAGTGAAC AGCCTGAGCA 10980

TTAGGTGCCC AAACACGGAA GGTATAGCCA TGCTCTCCAT TTAGTTCTTC CCTATGTGCT 11040

CCTAGATAAT GTTGGAGATA AAAATTTTCA CCCGTCATAA AGGTTTTTAA TGCTTCTCTA 11100

TTATCCATAT ACTCCCCTTC TCCTGTAAGC GTTTTCTATG TTTTTATTAT ACTACCTTTT 11160

TAGAGAAGAT TCAAGTAAAT TACTATACTT CTTTAATTAT TTTGAAAATC TACAACAAGT 11220

TCACTTACTC GTTCAATTGT AAATCAATAT TTTTTCAAAA AATTGCGAAA ACGCCTTTCT 11280

TTTTCTACTA TAGTGAAATG AAATAAAACA TGCGCAAATC GATTAAGGAA TTTAATCTAA 11340

TTTCTAACAA TGTCTTAGAA ATCAAAGTGT ACTATTTTAA CTCC 11384

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7577 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



Ui> SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
TGTTGATTTG TTACTAGACG TTGACCAACG TCCTTCGGCT GGAAAAGGAA 
TTTCCAACAC GTTTTCGCCA TGTTTGGTGC GACCATCTTG GTACCATTGA 
GCCTGTATCT GTTGCCCTTT TTGCTTCAGG TGTTGGAACA CTCATCTACA 
TGGTTTTAAA GTTCCAGTTT ATCTAGGTTC TTCATTTGCC TTTATCACAG 
GGCTATGAAA GAAATGGGGG GGGATGTATC TGCTGCCCAA ACAGGGGTTA 
TTTGGTCTAT GTCCTTGTTG CTACCAGCAT CCGATTTGTA GGAACAAAAT 
ACTCTTGCCA CCAATCATTA TCGGTCCTAT GATCATCGTT ATCGGTCTTG 
TTCAGCTGTT ACCAATGCAG GTCTTGTAGC AGACGGAAAT TGGAAAAATG 
CGTTGTTACT TTCCTAATTG CTGCCTTTAT CAATACAAAA GGAAAAGGCT 
CATTCCATTC CTCTTTGCCA TTATCGGTGG TTACCTTTTC GCACTAACTC 
TGACTTTACA CCAGTTCTTA AAGCCAACTG GTTCGAAATT CCTGGTTTCT 
TAGCACAGGT GGTGCCTTTA AAGAGTACAA TCTTTACTTT GGTCCAGAAG 



TTCTCCTTAG 
TTTTGGGAAT 
TGATTGCTAC 
CTATGTCACT 
TCTTGACTGG 
GGATTGATAA 
GACTTGCAGG 
CTCTGGTAGC 
TCCTACGAAT 
TTGGCTTGGT 
ACTTGCCATT 
CCATCGCTAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
CTTGCCAATC GCTATCGTAA CAATTTCTGA ACATATCGGA GACCATACTG TTTTGGGTCA 780
AATCTGTGGT CGTCAATTCT TAAAAGAACC AGGTCTTCAC CGTACTCTTC TTGGTGACGG 840
TATCGCAACT TCTGTTTCTG CCTTCCTTGG TGGACCAGCC AATACAACTT ACGGAGAAAA 900
TACAGGGGTT ATCGGTATGA CTCGTATCGC TTCTGTCTCA GTTATCCGTA ACGCTGCCTT 960
CATCGCGATT GCCCTCAGCT TCCTTGGTAA ATTCACTGCC TTGATTTCAA CTATTCCAAA 1020
CGCTGTACTT GGTGGTATGT CAATCCTTCT CTATGGGGTT ATCGCCAGCA ATGGTTTGAA 1080
AGTCTTGATT AAAGAACGTG TTGATTTCGC TCAAATGCGA AACCTCATCA TCGCAAGTGC 1140
TATGTTGGTT CTTGGACTTG GAGGAGCTAT CCTTAAACTT GGTCCAGTTA CACTTTCAGG 1200
TACTGCCCTT TCAGCCATGA CAGGAATCAT CTTGAACTTG ATCTTGCCAT ACGAAAATAA 1260
AGACTAAGAG TCTAAATACA CCTAATCCAC TCAGACAGCT GAGTGGATTT TTCGTATACC 1320
ATAATAAAAG TGTCTTAACA AAATTATTAA AATCAAAAAA CGTATAATAT CAGATATTCT 1380
AAAACCTTGA TACTGTACGT TTTATCATAG AAATTTTTAC TTTATTTTCT CATCAAATGA 1440
GATTTGCATC AATCTCTTGT CTTACTTGCG TTTCTTCTTC GCTTTCTTCA TTTTGTTAGC 1500
CATACGTTTC ATGGACTGTT TCATGGCAAA TTCACCAATT TTACCTTTCA AACCGCCACC 1560
AAACATCTGG CTCATATCTG GCATTCCTGC TCCTCCGAGA GCTGATAAGT CAGGCATACC 1620
GCCTTGTCCC ATCATTCCTT CAAGGGCAGA CATATCCATT CCTCCCATAT TTGGCATATT 1680
TTTAGGAAGG TTATTTGGAT TAATCCCCAT TTGCTTCATC ATTTTATTCA TATCCCCAGA 1740
CATAACACCC TGCATGAGCT GTTTAGCCTG GTTAAAGTCC TTGATGAATT TATTGACTTC 1800
GACGAATGTA TTTCCAGAAC CAGCAGCAAT ACGACGGCGA CGGCTTGGAT TTAACAAATC 1860
TGGGTTTTCA CGCTCTTCAG GTGTCATCGA AGACACAATG GCACGTTTAC GAGCAATCTG 1920
GCGTTCATCC ACCTTCATGT TTTGAAGGGC TGGATTGTTG GCCATACCTG GAATCATCTT 1980
GAGCAAGTCT TCCATCGGCC CCATATTTTG CACCTGATCT AATTGATCGA TGAAATCATT 2040
AAAATCAAAG GTGTTTTCGC GCATCTTCTC AGCCATTTCA AGGGCTTTTT GTTCATCGTA 2100
TTCCTGAGAA GCTTTCTCAA TCAAAGTGAG CATATCCCCC ATACCAAGGA TACGGCTAGA 2160
CATGCGGTCT GGGTGGAAGG TTTCAATGTC CGTAATCTTT TCACCTGTAC CAGTGAACTT 2220
GATTGGTTTT CCAGTAATGT GACGAACAGA CAGAGCAGCA CCACCACGAG TATCGCCATC 2280
AATCTTGGTA AGGATGACCC CAGTCACTTC CAACTGAGCA TTAAACTCAC GCGCAACATT 2340
GGCTGCTTCC TGACCAATCA TAGCATCAAC GACAAGCAAG ATTTCATTTG GTTGAGCCAA 2400
TGCTTTCACA TCACGAAGCT CATTCATGAG GAGCTCATCA ATCTGCAAAC GACCCGCAGT 2460
ATCAATCAAG ACATAGTCGT TATGATTAGT TTGGGCTTGC TCCAAACCTT GACGTACAAT 2520
CTCAACAGCT GGTACTTCTG TTCCAAGTGC AAAGACAGGC ACATCAATCT GTTGTCCCAA 2580
GGTCTTAAGC TGGTCAATGG CAGCTGGACG ATAAATATCC GCCGCAATCA TCAAAGGACG 2640
AGCATTTTCT TCTTTCTTGA GTTTGTTGGC CAATTTACCA GCAAAGGTTG TTTTACCAGC 2700
CCCTTGTAAA CCAACCATCA TGATGATGGT TGGAATCTTA GGTGACTTGA TAATTTCTGC 2760
CGTATCAGAA CCTAAAACGG CTGTCAATTC CTCATCAACG ATTTTAATAA TCTGTTGCGC 2820
AGGATTAAGT GTATCAATGA CCTCATGCCC GACTGCACGC TCACGAACTT TCTTGATAAA 2880
GTCCTTTACA ACAGGCAAGG CAACGTCGGC CTCGAGCAAG GCCAAGCGAA TTTCTTTGGT 2940
TGCCTCTTGG ACATCAGATT CAGAGATTTT TCCTTTTTTA CGTAGATTTT TAAAGACGTT 3000
CTGCAAACGT TCTGTTAAAC TTTCAAATGC CATTTTTCTT CCTCTTATTC TCTATTATCA 3060
ATGCTTGTTA AAATTTCTAT CTGCTCCTGC AGAAAGTCAT CCTTGGGATA GCGCTCCAAA 3120


A rcTGATCAA 


AAATCTGACT 


GCGGACAATA 


TAGTCCGAGT 


ACATGTGCAA 


TTTCATCTCA 


3180 


i AA 1 C i I CCA 


GAATCTTTTC 


TGTTCGCTTG 


ATATTGTCAT 


AGACAGCCTG 


ACGACTGACA 


3240 


V- v-k»>\/\\^ i WC X' 


CCCCAA1 1 TC 


AGCAAGGCTG 


TAATCATCAG 


CGTAGTAGAG 


CTCGATATAA 


3300 




TATCTGTCAA 


AAGCGCCGCA 


TAAAATTCAA 


AGAGCGCATT 


CATACGATTG 


3360 


GTTTTTTCGA TTTCCATAAC TTTTATTATA CCAAAAATTA GCCTAATCTA CCACACTAGG 3420
AAGCCGATCC AAGAAGATAG ATAGCTAAAT TTGAAAAAGA CATGAGCCTA GCCCCAAGTA 3480
ATTTCCAATT GATAGCTGGC AAAGGGATGT CCCTCTTGAT TTTGTAGTTG ATAATCTAGT 3540
TCAATCTTTT GCCTATCAAC TTGATAATGG CTCGTTTGGA TGATAAACTC CTGCATGCCC 3600
ATAGGTGTAG GAATATAGGC TAAACTATCG CTATCCTTTA GAAAGCGCAT AATGGTCTTG 3660
GGATTAGAAA ATCGGCTCAT CACAAGTTCT TGACCATGAA ATTTAATCAC TACTTTTTCC 3720
TTTTCCTCAT TATAGAAAAG CAGGTAGCTA TAATCTCCTT TTTCATGCAC TTCCACATCA 3780
TAAAGCTGGT CAATCACTTC CAACTGCTCA TCAAACTGAA TCGTATTTCG CATCCGAATC 3840
TTCACATCAG GCCCTCTTTC TTGTCTCTTG TCCTACTATT TTACCAAAAA GAGCAGGATT 3900
TTGCTATAAT GGTCATATGA ACGAAAAAGT ATTCCGTGAC CCTGTTCACA ACTACATCCA 3960
TGTCAATAAT CAAATCATCT ATGACTTGAT TAATACAAAA GAATTTCAGC GTTTGCGCCG 4020
GATCAAACAA CTGGGAACTT CCAGTTATAC CTTCCACGGT GGAGAACACA GTCGCTTCTC 4080
TCACTGTCTA GGAGTCTATG AAATTGCACG ACGCATCACA GAGATTTTCG AAGAAAAATA 4140
TCCTGAGGAA TGGAATCCTG CCGAGTCTCT CTTGACCATG ACCGCTGCTC TCCTACACGA 4200
CCTTGGGCAT GGTGCCTACT CCCATACTTT TGAACATCTC TTTGATACAG ACCATGAAGC 4260



WO 98/18931 



PCT/US97/19588



CATTACTCAG GAGATTATTC AAAATCCTGA GACAGAGATT CACCAAGTCC TGCTACAAGT 4320
GGCACCTGAT TTCCCAGAAA AGGTGGCCAG TGTCATTGAC CATACCTATC CTAATAAGCA 4380
GGTCGTGCAG CTCATTTCTA GTCAGATTGA CGCAGATCGC ATGGACTATC TCTTGCGCGA 4440
CTCCTATTTT ACAGGAGCAT CCTATGGGGA ATTTGACCTG ACTCGAATCC TCCGAGTCAT 4500
TCGTCCTATC GAAAATGGTA TCGCCTTTCA GCGCAATGGC ATGCACGCCA TCGAAGACTA 4560
CGTCCTCAGT CGCTACCAGA TGTACATGCA GGTTTATTTC CACCCCGCAA CACGCGCCAT 4620
GGAAGTTCTC CTACAGAATC TTCTCAAACG CGCCAAGGAA CTCTATCCTG AGGACAAGGA 4680
TTTCTTTGCC CGAACTTCTC CACACCTCCT GCCTTTCTTC GAAAAAAATG TGACCTTGAC 4740
TGACTATCTG GCTCTGGATG ATGGCGTGAT GAATACCTAC TTCCAGCTTT GGATGACCAG 4800
TCCTGACAAG ATTCTTGCAG ATTTATCGCA TCGCTTTGTC AACCGCAAGG TCTTTAAATC 4860
CATTACCTTT TCACAAGAGG ACCAAGATCA ACTTACTAGC ATGAGAAAAT TGGTTGAGGA 4920
TATCGGCTTT GATCCCGACT ACTACACTGC CATTCATAAG AACTTTGACC TCCCTTATGA 4980
TATCTATCGT CCCGAATCTG AAAACCCACG GACACAGATT GAGATTTTAC AAAAAAATGG 5040
AGAACTGGCC GAACTCTCTA GCCTGTCTCC TATCGTCCAA TCCCTTGCTG GCAGTCGCCA 5100
CGGAGATAAT CGCTTTTATT TTCCAAAAGA AATGTTGGAC CAAAACAGCA TCTTTGCAAG 5160
CATTACCCAG CAATTTTTAC ACTTGATTGA GAACGATCAT TTTACCCCAA ATAAAAACTA 5220
GAAGAGGAAA TTTATGAGTA TTAAACTAAT TGCCGTTGAT ATCGACGGAA CCCTTGTCAA 5280
CAGCCAAAAG GAAATCACTC CTGAAGTTTT TTCTGCCATC CAAGATGCCA AAGAAGCTGG 5340
TGTCAAAGTC GTGATTGCAA CTGGCCGCCC TATCGCAGGC GTTGCCAAAC TTCTAGACGA 5400
CTTGCAGTTG AGAGACGAGG GGGACTATGT GGTAACCTTC AACGGTGCCC TTGTCCAAGA 5460
AACTGCTACA GGACATGAGA TTATCAGCGA ATCCTTGACT TATGAGGATT ATCTAGATAT 5520
GGAATTCCTC AGTCGCAAGC TCGGTGTCCA CATGCATGCC ATTACCAAGG ACGGTATCTA 5580
TACTGCAAAT CGCAATATCG GAAAATACAC TGTACACGAA TCAACCCTCG TCAGCATGCC 5640
TATCTTCTAC CGTACCCCTG AAGAAATGGC TGGCAAAGAA ATTGTTAAAT GTATGTTTAT 5700
CGATGAACCA GAAATTCTCG ATGCTGCGAT TGAAAAAATT CCAGCAGAAT TTTACGAGCG 5760
CTACTCCATC AACAAATCTG CTCCTTTCTA CCTCGAACTC CTTAAAAAGA ATGTAGACAA 5820
GGGTTCAGCC ATTACTCACT TGGCTGAAAA ACTCGGATTG ACCAAAGATG AAACCATGGC 5880
AATCGGTGAT GAAGAAAATG ACCGTGCCAT GCTGGAAGTC GTTGGAAACC CCGTTGTCAT 5940
GGAAAATGGA AATCCAGAAA TCAAAAAAAT CGCCAAATAC ATCACCAAAA CAAATGACGA 6000
ATCCGGCGTT GCCCATGCCA TCCGAACATG GGTACTGTAA AAGTATCATT TTTCAATAAG 6060
AATTGATTAG CAATAAAATC CAATGAATTT TTTTAGCAAA CTATTTAATT TAAAACAAAA 6120
TAATCATAAT AGAGACACAA ATTCTGATTG TAACAATTTT TACCTAAACG AATTAGAATG 6180
TGGCCTTACT CCTGGGCAAC TCATACTCAT AGATTGGACT CAAAAAACAG GGAGAAATTA 6240
TAATTTCCCA AGATATTTTA AATACTCTCT TCAAATTGAC CCTGAATCTA CACACAATCA 6300
ATTATACAAA TTAGGATACT TCACTAAAAA TAAGACTTTA TCATATCTTA CAGTAGTAGA 6360
ATTAAAAACT ATATTATCTA AACATAATTT AGCTACTTCT GGAAAAAAAG CAGAATTAAT 6420
TACAAGAATA ATTAATAATG TTAACATTGA CAATTTAGAT ATTCCGTTCG AATTTAAACT 6480
AACAAAAGAA GCACAAAATC TTATTATCGA ACATAGTGAC TATATCAAAG CATACTATGA 6540
TAAAGACATA ACTATGGAAG ATTATTGTAA AGAAAAAAAC AATATCTCTT TTAAAGCAAC 6600
TTTTGGTGAT ATAAAATGGA GTCTCTTAAA TAAACAAGCT CATAGGAATA CTGTATCAGG 6660
AGATTTTGGA TGCTTATCTA ACACACGAAA GGCTCAGGGA AGACATTTGG AACAAGAAGG 6720
TAATATTAAA CATGCTTTAA TATATTACAT AGAATCTTTG ATAATTACTA TTTCAGGATT 6780
AGAAAACAAT TTTTCAGCCA CTGATTATCC AGTATATTAT CCCGATTCGA TACCTGACTA 6840
CTCACTAAAA CATATTCAAA CATTAATGGA ATCATTATCT GATGACGATT ATGATTTTGC 6900
TTTTGATGAA GCATTATTTC GCTTCTCAAT TTTGAATGCA AATCATTTTT TATCTAAGGA 6960
AGATATTGAC TATTTAAGAG TTAATTTACC TCGTTCCACT GCTGAAGAAA TAAACAATTA 7020
CTTAAAGAAA TATGAATGTT ATAGTCCTTT AAATAATTTA GAACTTGACG ATTTTGAATA 7080
AATTGACTAT ACAAACATTT ATATACTCGA TATAGTCTCA ATTTTATCTG ATGATTGCCC 7140
AAATTTTTCA ATAATAAAAC GCATAATATT ATGGAGACAA TCCCCTATAT TATGCGTTCT 7200
TTTAATATCA AAGACTTTTT GACAAACTTC TTTGATATCT AATTACATGC CCCCTGCAGG 7260
AATCGAACCT GCAACTACTC CTTAGGAGGG AGTTGTTATA TCCATTGAAC TAAGGGAGCT 7320
AGATAAAAAC TCTGCTAAAT GAGCAGAGTT TTTTAGTCGA ATTAACGACG GATTTCTTTG 7380
ATACGAGCTG CTTTACCTTG AAGAGCACGC AAGTAGTACA ATTTCGCACG ACGTACTTTA 7440
CCGTAACGAA CAACTTCGAT TTTTTCAACA CGTGGAGTGT GGATTGGGAA GATACGCTCA 7500
ACACCTACAC CGTTAGAGAT TTTACGAACT GTGTAGTTTT CTGAGATTCC AGCACCTTTA 7560
CGTGCGATAA CAACACG 7577



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4945 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CCTCGCTGAT GATTGGTGCT GTTTTATTTG CTGGTCCAGC CTTGGCTGAA GAAACTGCAG 60 

TTCCTGAAAA TAGCGGAnCT AATACAGAGC TTGTTTCAGG AGAGAGTGAG CATTCGACCA 120 

ATGAAGCTGA TAAGCAGAAT GAAGGGGAAC ATGCTAGAGA AAACAAGCTA GAAAAGGCAG 180 

AAGGAGTAGC GATAGCATCT GAAACTGCTT CGCCAGCAAG CAATGAAGCT GCAACTACTG 240 

AAACTGCAGA AGCAGCTAGC GCAGCTAAAC CAGAGGAAAA AGCAAGTGAG GTGGTTGCAG 300 

AAACACCATC TGCAGAAGCA AAACCTAAGT CTGACAAGGA AACAGAAGCA AAGCCCGAAG 360 

CAACTAACCA AGGGGATGAG TCTAAACCAG CAGCAGAAGC TAATAAGACT GAAAAAGAAG 420 

TCCAGCCAGA TGTCCCTAAA AATACAGAAA AAACATTAAA ACCAAAGGAA ATCAAATTTA 480 

ATTCTTGGGA AGAATTGTTA AAATGGGAAC CAGGTGCTCG TGAAGATGAT GCTATTAACC 540 

GCGGATCTGT TGTCCTCGCT TCACGTCGGA CAGGTCATTT AGTCAATGAA AAAGCTAGCA 600 

AGGAAGCAAA AGTTCAAGCC TTATCAAACA CCAATTCTAA AGCAAAAGAC CATGCTTCTG 660 

TTGGTGGAGA AGAGTTCAAG GCCTATGCTT TTGACTATTG GCAATATCTA GATTCAATGG 720 

TCTTCTGGGA AGGTCTCGTA CCAACTCCTG ACGTTATTGA TGCAGGTCAC CGTAACGGGG 780 

TTCCTGTATA CGGTACACTC TTCTTCAACT GGTCTAATAG TATTGCAGAT CAAGAAAGAT 840 

TTGCTGAAGC TTTGAAGCAA GACGCAGATG GTAGCTTCCC AATTGCCCGT AAATTGGTAG 900 

ACATGGCCAA GTATTATGGC TATGATGGCT ATTTCATCAA CCAAGAAACA ACTGGAGATT 960 

TGGTTAAACC TCTTGGAGAA AAGATGCGCC AGTTTATGCT CTATAGCAAG GAATATGCTG 1020 

CTAAGGTAAA CCATCCAATC AAGTATTCTT GGTACGATGC CATGACCTAT AACTATGGAC 1080 

GTTATCATCA AGATGGTTTG GGAGAATACA ACTACCAATT CATGCAACCA GAAGGAGATA 1140 

AGGTTCCGGC AGATAACTTC TTTGCTAACT TTAACTGGGA TAAGGCTAAA AATGATTACA 1200 

CTATTGCAAC TGCCAACTGG ATTGGTCGTA ATCCTTATGA TGTATTTGCA GGTTTGGAAT 1260 

TGCAACAGGG TGGTTCCTAC AAGACAAAGG TTAAGTGGAA TGACATTTTA GACGAAAATG 1320 

GGAAATTGCG CCTTTCTCTT GGTTTATTTG CCCCAGATAC CATTACAAGT TTAGGAAAAA 1380 

CTGGTGAAGA TTATCATAAA AATGAAGATA TCTTCTTTAC AGGTTATCAA GGAGACCCTA 1440 

CTGGCCAAAA ACCAGGTGAC AAAGATTGGT ATGGTATTGC TAACCTAGTT GCGGACCGTA 1500 

CGCCAGCGGT AGGTAATACT TTTACTACTT CTTTTAATAC AGGTCATGGT AAAAAATGGT 1560 

TCGTAGATGG TAAGGTTTCT AAGGATTCTG AGTGGAATTA TCGTTCAGTA TCAGGTGTTC 1620 
TTCCAACATG GCGCTGGTGG CAGACTTCAA CAGGGGAAAA ACTTCGTGCA GAATATGATT 1680
TTACAGATGC CTATAATGGC GGAAATTCCC TTAAATTCTC TGGTGATGTA GCCGGTAAGA 1740
CAGATCAGGA TGTGAGACTT TATTCTACTA AGTTAGAAGT AACTGAGAAG ACCAAACTTC 1800
GTGTTGCCCA CAAGGGAGGA AAAGGTTCTA AAGTTTATAT GGCATTCTCT ACAACTCCAG 1860
ACTACAAATT CGATGATGCA GATGCATGGA AAGAGCTAAC CCTTTCTGAC AACTGGACAA 1920
ATGAAGAATT TGATCTTAGC TCACTAGCGG GTAAAACCAT CTATGCAGTC AAACTATTTT 1980
TCGAGCATGA AGGTGCTGTA AAAGATTATC AGTTTAACCT AGGACAATTA ACTATCTCGG 2040
ACAATCACCA AGAGCCACAA TCGCCGACAA GCTTTTCTGT AGTGAAACAA TCTCTTAAAA 2100
ATGCCCAAGA AGCGGAAGCA GTTGTGCAAT TTAAAGGCAA CAAGGATGCA GATTTCTATG 2160
AAGTTTATGA AAAAGATGGA GACAGCTGGA AATTACTAAC TGGCTCATCT TCTACAACTA 2220
TTTATCTACC AAAAGTTAGC CGCTCAGCAA GTGCTCAGGG TACAACTCAA GAACTGAAGG 2280
TTGTAGCAGT CGGTAAAAAT GGAGTTCGTT CAGAAGCTGC AACCACAACC TTTGATTCGG 2340
GTATGACTGT AAAAGATACC AGCCTACCAA AACCACTAGC TGAAAATATC GTTCCAGGTG 2400
CAACAGTTAT TGATAGTACT TTCCCTAAGA CTGAAGGTGG AGAAGGTATT GAAGGTATGT 2460
TGAACGGTAC CATTACTAGC TTGTCAGATA AATGGTCTTC AGCTCAGTTG AGTGGTAGTG 2520
TGGATATTCG TTTGACCAAG CCACGTACCG TTGTTAGATG GGTCATGGAT CATGCAGGAG 2580
CTGGTCGTGA GTCTGTTAAC GATGGCTTGA TGAACACTAA AGACTTTGAC CTTTATTATA 2640
AAGATGCAGA TGGTGAGTGG AAGCTAGCTA AGGAAGTCCG TGGTAACAAA GCACACGTGA 2700
CAGATATCAC TCTTGATAAA CCAATCACTG CTCAAGACTG GCGCTTGAAT GTTGTCACTT 2760
CTGACAATGG AACTCCATGG AAGGCTATTC GTATCTATAA CTGGAAAATG TATGAAAAGC 2820
TTGATACTGA GAGTGTCAAT ATTCCGATGG CCAAGGCTGC AGCCCGTTCT CTAGGCAATA 2880
ACAAGGTACA AGTTGGCTTT GCAGATGTAC CGGCTGGAGC AACTATTACC GTTTATGATA 2940
ATCCAAATTC TCAAACTCCG CTCGCAACCT TGAAGAGCGA AGTTGGAGGA GACCTAGCAA 3000
GTGCACCATT GGATTTGACA AATCAATCTG GTCTTCTTTA TTATCGTACC CAGTTGCCAG 3060
GCAAGGAAAT TAGTAATGTC CTAGCAGTTT CCGTTCCAAA AGATGACAGA AGAATCAAGT 3120
CAGTCAGCCT AGAAACAGGA CCTAAGAAAA CAAGCTACGC CGAAGGGGAG GATTTGGACC 3180
TTAGAGGTGG TGTTCTTCGA GTTCAGTATG AAGGAGGAAC TGAGGACGAA CTCATTCGCC 3240
TAACTCACGC AGGTGTATCA GTATCAGGTT TTGATACGCA TCATAAGGGA GAACAGAATC 3300
TTACTCTCCA ATATTTGGGA CAACCGGTAA ATGCTAATTT GTCAGTGACT GTCACTGGCC 3360
AAGACGAAGC AAGTCCGAAA ACTATTTTGG GAATTGAAGT AAGTCAGGAA CCGAAAAAAG 3420

ATTACCTAGT TGGTGATAGC TTAGACTTGT CTGAAGGACG CTTTGCAGTG GCTTATAGCA 3480 

ATGACACCAT GGAAGAACA? TCCTTTACTG ATGAGGGAGT TGAAATTTCT GGTTACGATG 3540 

CTCAAAAGAC TGGTCGTCAA ACCTTGACGC TTCATTACCA AGGCCATGAA GTTAGCTTTG 3600 

ATGTTTTGGT ATCTCCAAAA GCAGCATTGA ACGATGAGTA CCTCAAACAA AAATTAGCAG 3660 

AAGTTGAAGC TGCTAAGAAC AAGGTGGTCT ATAACTTTGC TTCATCAGAA GTAAAAGAAG 3720 

CCTTCTTGAA AGCAATTGAA GCGGCCGAAC AAGTGTTGAA AGACCATGAA ACTAGCACCC 3780

AAGATCAAGT CAATGACCGA CTTAATAAAT TGACAGAAGC TCATAAAGCT CTGAATGGTC 3840 

AAGAGAAATT TACGGAAGAA AAGACAGAGC TTGATCGCTT AACAGGTGAG GTTCAAGAAC 3900 

TCTTGGCTGC CAAACCAAAC CATCCTTCAG GTTCTGCCCT AGCTCCGCTT CTTGAGAAAA 3960 

ACAAGGCCTT GGTTGAAAAA GTAGATTTGA GTCCAGAAGA GCTTACAACA GCGAAACAGA 4020 

GTCTAAAAGA TCTGGTTGCT TTATTGAAAG AAGACAAGCC AGCAGTCTTT TCTGATAGTA 4080 

AAACAGGTGT TGAAGTACAC TTCTCAAATA AAGAGAAGAC TGTCATCAAG GGTTTGAAAG 4140 

TAGAGCGTGT TCAAGCAAGT GCTGAAGAGA AGAAATACTT TGCTGGAGAA GATGCTCATG 4200 

TCTTTGAAAT AGAAGGTTTG GATGAAAAAG GTCAAGATGT TGATCTCTCT TATGCTTCTA 4260

TTGTGAAAAT CCCAATTGAA AAAGATAAGA AAGTTAAGAA AGTATTTTTC TTACCTGAAG 4320 

GCAAAGAGGC AGTAGAATTG GCTTTTGAAC AAACGGATAG TCATGTTATC TTTACAGCAC 4380 

CTCACTTTAC TCATTATGCC TTTGTTTATG AATCTGCTGA AAAACCACAA CCTGCTAAAC 4440 

CAGCACCACA AAACACAGTC CTTCCAAAAC CTACTTATCA ACCGACTTCT GATCAACAAA 4500 

AGGCTCCTAA ATTGGAAGTT CAAGAGGAAA AGGTTGCCTT TCATCGTCAA GAGCATGAAA 4560 

ATACTGAGAT GCTAGTTGGG GAACAACGAG TCATCATACA GGGACGAGAT GGACTGTTAA 4620 

GACATGTCTT TGAAGTTGAT GAAAACGGTC AGCGTCGTCT TCGTTCAACA GAAGTCATCC 4680 

AAGAAGCGAT TCCAGAAATT GTTGAAATTG GAACAAAAGT AAAAACAGTA CCAGCAGTAG 4740 

TAGCTACACA GGAAAAACCA GCTCAAAATA CAGCAGTTAA ATCAGAAGAA GCAAGCAAAC 4800 

AATTGCCAAA TACAGGAACA GCTGATGCTA ATGAAGCCCT AATAGCAGGC TTAGCCAGCC 4860 

TTGGTCTTGC TAGTTTAGCC TTGACCTTGA GACGGAAAAG AGAAGATAAA GATTAAATAT 4920 

CGAAAAATCT TGTGAAATCT TTCCG 4945 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25002 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:



GACAACTCAA GTAGCTTTTT CTTATTTTGA AAAAGGAGAT CAGAGTTTAA CTATGTCAGA 60
AAAATCACAA TGGGGGTCGA AACTTGGTTT TATTCTAGCA TCTGCTGGCT GGCCATCGGG 120
CTTGGTTCCG TTTGGAAGTT TCCCTACATG ACTGCTGCTA ATGGCGGTGG AGGCTTTTTA 180
CTAATCTTTC TCATTTCCAC TATTTTAATC GGTTTCCCTC TCCTGCTGGC TGAGTTTGCC 240
CTTGGCCGTA GTGCTGGCGT TTCCGCTATC AAAACCTTTG GAAAACTGGG CAAGAATAAC 300
AAGTACAACT TTATCGGTTG GATTGGCGCC TTTGCCCTCT TTATCCTCTT ATCTTTTTAC 360
AGTGTTATCG GAGGATGGAT TCTAGTCTAT CTAGGTATTG AGTTTGGGAA ATTGTTCCAA 420
CTTGGTGGAA CGGGTGATTA TGCTCAGTTA TTTACTTCAA TCATTTCAAA TCCAGCCATT 480
GCCCTAGGAG CTCAAGCGGC CTTTATCCTA TTGAATATCT TCATTGTATC ACGTGGGGTT 540
CAAAAAGGGA TTGAAAGAGC TTCGAAAGTC ATGATGCCCC TGCTCTTTAT CGTCTTTGTT 600
TTTATCATCG GTCGCTCTCT CAGTTTGCCA AATGCCATGG AAGGGGTTCT TTACTTCCTC 660
AAACCAGACT TTTCAAAACT GACTAGCACT GGTCTCCTCT ATGCTCTGGG ACAATCTTTC 720
TTTGCCCTCT CACTAGGGGT TACAGTCATG TTGACCTATG CTTCTTACTT AGACAAGAAA 780
ACCAATCTAG TCCAGTCAGG AATCTCCATC GTAGCCATGA ATATCTCGAT ATCCATCATG 840
GCAGGTCTAG CCATTTTCCA AGCTCGATCC CCCTTCAATA TCCAGTCTGA AGGGGGACCC 900
AGCCTGCTCT TTATCGTCTT GCCTCAACTC TTTGACAAGA TGCCTTTTGG AACCATTTTC 960
TACGTCCTCT TCCTCTTGCT CTTCCTTTTT GCGACAGTCA CTTTTTCTGT CGTGATGCTG 1020
GAAATCAATG TAGACAATAT CACCAACCAG GATAACAGCA AACGTGCCAA ATGGAGTGTT 1080
ATTTTAGGAA TTTTGACCTT TGTCTTTGGC ATTCCTTCAG CCCTATCTTA CGGTGTCATG 1140
GCGGATGTTC ACATTTTTGG TAAGACCTTC TTTGACGCTA TGGACTTCTT GGTTTCCAAT 1200
CTCCTCATGC CATTTGGAGC TCTCTACCTT TCACTTTTTA CAGGCTATAT CTTTAAAAAG 1260
GCTCTTGCAA TGGAGGAACT CCATCTCGAT GAAAGAGCAT GGAAACAAGG ACTGTTCCAA 1320
GTCTGGCTCT TCCTTCTTCG TTTCTTCGTT TCGTCATTCC AATCATCATC ATTGTGGTCT 1380
TCATTGCCCA ATTTATGTAA TCAAAAAGGA CTTGAGTAGT GAACTCAGGC CCTTTCTTTT 1440
TATGGATGGC TAACAATCAA TTCCAAACCT TGCCCTTCCA GAGTCCAAGC TTCAACATCA 1500
CTTGGTAGGA TAAAGTGGCT GCCTTTTTGA ATTGGATAAT TTTTCCCGTC AACAGTTAGC 1560
TGACCTTGAC CAGCCAAGAC ACTCAATAAG CTGTAGTCAG CTGTCTTTTC AAAGTCAACT 1620
TTTCCAGTAA TTTCCCACTT GTAAACTGCG AAGAAATCAT TAGATACAAG GAGAGTGGAA 1680
CGCAAATCAT CTGCTTTAAC AGTTACAGGA CGGCTATTTG CTGGCTCACC AATGTTCAAG 1740
ACATCGATGG ATTTTTCAAG ATGAAGTTCA CGCAAGTTGC CTTTGTCATC CTTGCGGTCA 1800
AAGTCATAGA CGCGATAGGT GGTATCGCTA GACTGCTGGG TTTCAAGGAT TAAGATACCC 1860
GCCCCGATAG CGTGCATAGT CCCGCTTGGT ACATAGAAGA AATCTCCAGC CTTAACAGGG 1920
ACTTTGGTCA ACAAGTCATC CCAGTTCTTG TCCTCGATTT GCTGGCGGAG TTCTTCTTTT 1980
GACTTGGCAT TGTGACCGTA GATAATCTCT GAACCTTCAT CCGCTGCGAT AATGTACCAG 2040
CATTCTGTTT TTCCGAGTTC GCCTTCATGC TCGAGTCCAT AAGCATCGTC TGGGTGAACT 2100
TGGACACTGA GCCAGTCGTT GGCATCGAGG ATCTTGGTCA AAAGTGGAAA TACAGGTTCT 2160
GGACGATTGC CAAATAATTC ACGGTGTTCC GCATACAAAG TAGCAAGATC TGTTCCCTCG 2220
TAACGACCAT TGGCAACTTT AGAGACTCCA TTTGGATGGG CTGAGATGGC CCAATATTCT 2280
CCGATTTTTT CACTTGGGAT GTCGTAGCCA AACTCATCAC GTAGCTTGGC TCCACCCCAG 2340
ATTTTTTCTT GCATAACTGA TTGTAAAAAT AATGGTTCTG ACATGTCGAT CTCCTGTCTG 2400
ATTTTTCTCC CCTCATTATA GCAAAAAAAG AGTTCGAATT GAACTCTTTT TTACATCTTA 2460
TAAAGCAGGG AGAAGATTTT ATAAAAATAG TAAACAAATG TGCTCTACCC GATGCTTGCA 2520
CCATTGCTAT AAATGACATC CTTGTACCAA TAGAAGGACT TCTTCTTGCT ACGTTTGAGA 2580
GCTCCGTTTC CTACATTATC TCGATCTACA TAGATAAAGC CATAGCGCTT ATTCATTTCC 2640
CCTGTGCCAG CTGAAACCGG ATCGATACAG CCCCAAGTCG TATAACCAAG CAAGTCAACC 2700
CCGTCTTGGT AAATGGCATC TCGCATGGCC TTGATGTGGG CCTCTAAGTA AGTAATCCGA 2760
TAGTCATCTG CTACATAACC ATTCTCATCC GGTGTATCCA TAGCACCGAG TCCATTTTCT 2820
ACGATAATAC TAAACTAAAA TCAAAAAGCA TTATATAATA GTGATATGAA ATCAACTAAA 2880
GAAGAAATCC AAACCATCAA AACACTTTTA AAAGACTCTC GTACAGCTAA ATATCATAAA 2940
CGCCTTCAAA TCGTTCTATA GTAAAATGAA ATAAGAACAG TACAAATCGA TCAGGACAGT 3000
CAAATCGATT TCTAACAATG TTTTAGAAGT AGGGGTGTAC TATTCTAGTT TCAATCTACT 3060
ATATTTCGTC TGATGGGCAA ATCTTATAAA GAGATTATAG AACTTTTATA GTAGTTTGAA 3120
ATAAGATGTG AACAACTCTA TCAGGAAAGT CAAATTAATT TATAGAAATA TTTTAGCAGC 3180
CAAGGTGTAC TGTTATAGAT TCAATACACT ATAGACTGTA ATCAAACAAC GATTTGGCGA 3240
AATGTAAAAA AATATGAGGA GTTCGGACTC GACTCTCTCC TTCAAGAAAC ACGTGGTGGT 3300
CGTAACCATG CATATATGAC AGTTGAGGAA GAGAAAGCCT TTCTTGCCCG CCATTTGAAG 3360






GCTACAGAGG CAGGAGAATT TGTTACAATT GATGCCTTAT TTCAGGCTTA TAAAAAGGAG 3420
TTAGGTCGTT CCTACACACG TGATGCCTTC TATCAACTGT TGAAGCGCCA TGGTTGGCGA 3480
AATATTACGC CACGTCCAGA ACATCCTAAG AAAGCAGACG CTCAAACCAT TGTTGCGTCT 3540
AAAAATAAAA TCTCAATCCA AGAAGGCAAG AAAGCGTTTT AAATATAGTA GACGTTTTCG 3600
TAAGGTTTGC TTGATGTACC AAGCTGAAGC TGGTTTCGGT AGAATCAGTA AACTGGGATC 3660
TTGTTGGGCT CCAATAGGAG TAGGTCCACA TATCCATAGT CACTATATAC GAGAATTTCG 3720
CTATTGTTAT GGAGCTGTTG ATGCCTATAC AGGCGAATCA TTTTTCTTAA TAGCTGGTAG 3780
ATGTAATACT GAGTGGATGA ACGCCTTTTT AGAAGAGCTT TCACAAGCTT ATCCTTTTAC 3840
TCGTTATGGA CAATGCTATA TGGCATAAAT CAAGTACCTT AAAGATTCCG ACTAATATTG 3900
GTTTTGCATT TATTCCTCCA TACACACCAG AGATGAACCC CATTGAACAA GTGTGGAAAG 3960
akjJK I JXlj l AA ACGTGGATTT AAGAATAAAG CCTTTCGAAT TTTGGAAGAT GTCATGAATC 4020
AACTCCAAGA TGTCATACAA GGATTGGAGA AGGAGGTGAT AAAGTCCATC GTTAATCGGA 4080
GATGGACTAG AATGCTTTTT GAAAGCAGAT GAGTATTATA TGCAATTTCT TTATATAAAA 4140
AGACCGGATT GCTCCGATCT TTCAATAGTT CATATTCTCA ATTTCTATTT TAAAAATAGC 4200
TAAGGTTAAC GTCAAATGAC TACGCGACCT ATTTCATACG ATAAAAATCA AGCACTAGAC 4260
CAGCAGGTCC TTGAACTAAT AAGGACTCTG TTCCCCAATC GGTTACAGTT GGTCCGTGTA 4320
AAACCTTTAT ACCAAGCTCG TTCAACCGTT TGTAGTTCTG GTCTACATCC TCAACCTCGA 4380
TATGAATAAT GATTCCTGAC TGAAAGTTTT CCAAAGGAAC CAAATGATTT TGTGACAACA 4440
TAAGGCAGTG ACTACCAATC GTAAACTGAG CAAAACCATC ATTAGCATAA TCTGCCTTTT 4500
TATCCAAGAT ATGCTCCAAG TCAGCACAGA CTTGGGGAAC ATTTGAAACG ATAATATCTA 4560
ATTGATTTAA ATTCATTTAC TCTCCTCCAT AAAAAGACCG GATTGCTCCG ATCTTTTAAA 4620
GTTCTGCTCT ATGAAAATCA AAGAATAAAG TCTACAAGTT TCATATTTGA TTTTCGGCGA 4680
GAGGAATTAT TTAATTGCGC GTGATTGCAA TCCTTCTTCT TCCAAGAAGA GACGGAATGG 4740
TACGAGTTCT TCTGCTTCGT ATTTTTCCTT GAAGGCTTTG ATAGCTTCTT CTGAGTGAAG 4800
TTTTGGATCC AATTCAAGTA CTTCTACTGG AAGTGGACGG TGTTGAGTGA TGCGAGCATC 4860
GATGACAACA GTTTTACCTT CTTTGTTCAA TTTAACAGCT TCTGCAACAA CTGCATCGAT 4920
GTCTTCGATA CGGTCAACTG TGAATCCAAC AGCTCCTTGA GCTTCCGCAA TTTTAGCGTA 4980
GTCAGCGTTT GTGAAGTCTA CACCAAACAA GTGTTTGTTT GTATCTTCGT ATTTGTTCTT 5040
GATGAAGCOG TACTCAGCAT TTGAGAAGAC AAGGTTGATA ACTGGAAGGT CGTATTGAAC 5100
GTTTGTGATA ACGTCTGGGT AGCACATGTT GAATGCTCCG TCACCCATGA TGTTCCATAC 5160

TTGGCGATCT GGATTGTCTT TCTTAGCAGC GATACCACCA GGAAGGGCAA TACCCATTGT 5220 

CGCAAAGAGT GGAGATGTAC GCCACATGTT CTTAGGTGTC ATGTGAAGGT GACGAGTAGA 5280 

TGTTTGAGTA GTGTTACCTA CGTCGATTGA GTAGATAGCG TCTTGATCAG CATGTTTGTT 5340 

GATTGCATTG TAAACTTGAT ACAATTGCAA TTCACCCTCA GTTTTACCTT CGAGTTTGTT 5400 

CATGTAATCA CGCCAGTTTT GGTTGTTCTT AACGTTTGCA CGCCACCATG GAGTTGATTC 5460 

AACTGGGTTT ACTTTGTCAA GGATAGCTTT AGCTGCTTGA CCAGCATCAC CAAGGATTGA 5520 

AGCGTCAAGG GCATGACGTT TACCAAGTTT GTAAGGGTCG ATATCGACTT GGATGAATTT 5580 

TTCAGTGTTC TTGAATGCTT CGTAAACTTC AGCAAATGGG AAGTTTGAAC CAAGGAAAAG 5640 

AACTGTGTCT GCTTCAAAGA CCACTTCGTT GGCTGGTTTC CAACCAACAC GGTAAGCAGA 5700 

ACCTGTCAAA CCTTCATAGT TCCATTCGAA AGCTTCAAAG TTTTTACCAG TTGTGATGAT 5760 

TGGTGCTTTG ATTTTACGTG ACAATTCAGT AATCACTTCA CCAGCTTTAA CACCACCAAA 5820 

TCCAGCATAG ATAACTGGGC GTTCAGCATT GTTCAAGATT TCAACAGCTT TGTCGATTTC 5880 

AACTTCGTTC AAAGCAGGAG CGATGAATGA GCGTTCGTAT GAACCTGAAC CGTAGTATGA 5940

GTTTTCATCG ATTTCTTGGA AACCGAAGTT TACTGGAATT TCAACAACAG CTGGACCTTT 6000 

TTTAGAAACT GCAGCACGGC AGGCTTCGTC AATTACTTTT GGCAATTGCT CAGCGTAAGC 6060 

TACACGTTTG TTGTAAACAG CGATACCGTT GTACATTGGG TTTTGGTTAA GCTCTTGGAA 6120 

AGCATCCATG TTCAATTCGT TAACTGGACG TGATCCAAGG ATCGCTAGGA ATGGAGTGTT 6180 

ATCCATAGCT GCATCGTAAA CACCGTTAAT CAAGTGAGTC GCACCTGGAC CACCTGAACC 6240 

AACTGCAACC CCGATTGAGC CGCCGAATTT AGCTTGCATA ACCGCTGCAA GAGCACCTGT 6300 

CTCTTCGTGG CGAACTTGTA AGAAACGGAT ATCTTTGTCT TCAGCCAAAG CGTCCATCAA 6360 

TGAGCTGAGT GTTCCTGATG GGATACCGTA GATTGTATCT ACGCCCCATG TTTTCAATAC 6420 

GTTAAGCATT GCTGCAGATG CAGTAATTTT CCCTTGAGTC ATAATGATAA CTCTCCTTCA 6480 

ATTTTTTTAA ACTTGGAGAA TACGATTACA TAGAATTGGA AACGTTCTCC AAATTTTTAC 6540 

TATTCCACTG TATCATATTT ATGCTGACTT TTCTAAAAAT CTGCTCAAAA CTCTCTATTC 6600 

TCTATTCTAA TACAGTTTTG AAAGTTCTGT CATTTCTGTT TTATAACAAA GAAATCTAGT 6660 

CATTACTTTT AGTCTATTTT ACTAAAATTT AACAGAAGGG AACTGGTCAG AACAGATACA 6720 

GAACTAAAGG CCATGGCTAG ACCTGCCAAT TCTGGGTTGA GAGCCAGTCC AACACCTGAA 6780 

AAGACTCCTG CTGCAATCGG AATTCCGACA ACATTGTAGA TAAAAGCCCA GAAAAGATTG 6840 

AGTAGAATTC GATGAAAGGT TTTCTTACTC ATATCAAAGG CACGAACCAC TCCTAAAAGA 6900 
TTATTGGTTG TCAACACCAA ATCTGCTGAC TCGATGGCGA TATCTGTTCC AGCTCCCATA 6960
GCAATCCCCA CATCTGCTAC ACTAAGGGCA GGAGCGTCAT TGATACCGTC CCCAACAAAG 7020
GCTACTTTCC CTGACTGTTG CAGTTTATGG ATTTCATGGG CTTTTTCTTC TGGCAAGACG 7080
CCTGCAATGA CCTCTTCAAT TCCGATTTGA TCTGCAATAG CACGCGCCAC ACCAGCATTG 7140
TCTCCTGTCA GCATGACTGT TCGGAGACCA CGTTTTTTTA GCTGACTGAT GGCTAGCTTA 7200
GCATTTTCCT TAGGAATATC TTGCAAAGCA AGCAAGCCTT TGATTTCATT GTCAACAGCT 7260
AAGAACACAA CTGTCTTAGC TTCTTTTTCT AGTTCTTCTA GTTTATCTTG ATAAGTATTA 7320
GAAATATCCA TGCCATCCAG CATTTTAGCA TTTCCAAGTA AAACTTGTTT TCCATTGATT 7380
CGCCCTGAAA CACCTTTCCC GTGCAAGGAC TGAAAATTTT CAACAGTTTG AAACTCAAGT 7440
CCAGCTTCAC TCGCTCGCTT AACGATAGCC TCAGCCAGTG GGTGTTGAGA AGCATCTTCC 7500
AAGGAGGCTG CCAACCCAAA CACTTCTACT TCGTCGCCGA TGACATCTGT TACCACAGGT 7560
TTCCCTTCCG TCAAAGTCCC GGTCTTATCA AAGACAAGGG TTTGAACTTT CTGGATTTCC 7620
TGTAAGACAG TTCCATTTTT GAGGAGAACC CCCATCTTGG CACTACGTCC TGTCCCCACC 7680
ATAAGGGCTG TCGGTGTTGC AAGTCCCAAG GCACAAGGAC AGGCGATAAT CAAAACCGCC 7740
ACTCCGTAGA GAAGAGAGGA CACAAAGCTA GCTCCAAGCA CAACCACACT ATCCCTGAGC 7800
AAGACGAACC AAACCCAAAA GGTCATGATT CCTAAAATGA CAACTACTGG GACAAAAATC 7860
CCTGAAATCT TATCCGTCAA GTCCTGAATC GGCGCACGAC TTGTCTGAGC TTTCTTCACA 7920
AAATCCACAA TCTGAGCCAA AACAGTCTCT GAGCCAACTT TTTCTGCTCT AAAGACAAGC 7980
GTTCCACTAT GATTGATGGT TGAGCCAATG ACAGTATCTC CAACTGTCTT GTCCACAGGC 8040
AGACTCTCAC CTGTCACCAT GGATTCGTCA ATACTAGAGA CACCTTCTAC TACGACACCA 8100
TCAACAGCAA TCTTTTCACC GGGACGCACT CGAATCAGGT CGCCTACCTT GACTTGTTCC 8160
AAAGGAACTT GGACATAACT ATCATCACTC AAGACTTCTG CGGTTTTAGC TTGCAAGTCC 8220
AGTAATTTCT CCACAGCTTG GGACGTATTT TTTCTCATTT TTTCCTCAAA AACTGCTCCC 8280
AAAAGAACGA AAAAGAGGAT AAATCCAGCA CTTTCGAAGT AAACAGGGAG ACCAGCAAAG 8340
AGAGCAACTA GGCTATAGAA ATAAGCCACT AGAGTTCCCA GCGCAACCAA GGTATCCATG 8400
TTGGCATTGT GCTTTTTAAA ACTGGCCCAA GCACTCTGGA TATATGGCTT ACCTGCAACT 8460
AACATAATAG GCGTTGTTGC TAGAAAGGTT CCCCAATGCA TGACTTGATG ACTAATGCTA 8520
CCTGTCAACA TCCCAATCAT GAGAATCACA AGAGGCACAG TAAAGATACT AGTAATCCAA 8580
AAACGTTGCA GGAGAGATAG AGATTTTCGA GTCTTCTCAA CGACTGTATA GCTTCCCTTT 8640
TGCATCTTCA TGCCACAAGA AAATTCATGT CGCCCTAATT CTTGAGGCGT AAAACGAATG 8700
ACTTTCTCCT CATCTACGCC GATTGGTTCC AAGATACCTT CTTCTTCAAA CAGAATTTCC 8760
TTATAACAGT TTGAAGGAGT AGCACGATGA AAGGTAATCT CAGCTGGAAT TCCCTTTTGA 8820
AGCTGGATAT GGGCTGGATG ATAGCCTTTT TCAGCTCGGA TACGGATTTT TTGAATGCCA 8880
TTTTCTAAGC TTGCTTTCAC AATTTCTGTC ATAGTCTCCA CCTACTCTAC AATCATCTTG 8940
CCGTGCATCA TGTTCATACC ACAAGCAAAG CCAAACTCTC CAGCCTGTTC AGGCGTGATT 9000
TCCACTACAT ACTCTTCCCC CATTGGCAGG TTCGCATGTA CACCAAAATC TGGAAAAACA 9060
ATTTGATCCA GACATGGTGA AGGATCCTTG CGGTCAAAGA CAATGCGTGC TGGCACTGAT 9120
TTCTTGAGGA CAATCAACTC AGGAGTATAG CCTCCCATGA CTTCCACTCG AATCTCTTGG 9180
TATCCGTTTT TTTGCTGGGC TTTTTGTCCA GATTTTTCAG GCTTTTTGAA AAACCAAAAC 9240
AAGATAAACG CGATAAGGGC AATACAAATA ATGGTTACAA TACTATTTAA CATGACGTCT 9300
CCTTTACATA CAATTACATC TTACTTCTGT TACAGCACTT GATTTCTTCT CTGAAATCAC 9360
AGCTTCCAAG TCTTCCAAGT CAGTCTGAGT AAATTCACAT TCTACAATCA AGTCAGCCAA 9420
CAAATTCCTA ATCCTACGGG AACAAACCTT GTCTTTGATA TCTTGGACAA GTAAATCCCG 9480
ACTTTGGTCT AGAGTTAAAA GGGCTGAATA AACAAAGGAC TTGCCTTCTT TTTTCCGAGT 9540
CAAACACTCT TTATCAACCA GACGAGCCAA AAGTGTCTGA ACCGTGGACT TGGACCAGTC 9600
AAACCGCTCT GCCAAAACCC TAATCAAATC TGTACTGGTC TGCTCCCCCT GCATCCAAAT 9660
AATCTTCATG ACCTGCCATT CTGCATCTGA AATCTGCATT ACCATACCTC CAAAATCTAC 9720
ATTTGTCAAT TACACTCATC AGTATACTCT TAAAATCTAC ATTTGTCAAT TATAGAAATA 9780
ATATTTTCTT CGAAAAATAG AATTTTAATC ATTTGAAAAA CGATTTGCAG TCAAATATTA 9840
CTATATAAAC AATAAAAATA TGCTATACTA AAGAAAAAAG AAAACAACCA CTAGGGGTGC 9900
GTAAAGCTGA GATTAACGAC TGTTAGATCC CTCTGACTCA ATCTAGGTAA TGCTAGCTGA 9960
TGGAAGTGGA AATGATAATG GGGACTAGCA GTCTTCTATT GCCTTTCTAA AACAGACTAG 10020
CTTGTTCTTA AGAATACAAA CTTCAGTTGG TTGGGAGGTT TTAGATGACT TATTTACCCG 10080
TTGCTTTGAC CATTGCAGGG ACTGACCCTA GTGGTGGTGC TGGCATTATG GCAGATTTAA 10140
AGTCATTCCA AGCGAGAGAT GTCTATGGAA TGGCTGTTGT AACCAGTCTT GTCGCTCAAA 10200
ATACCAGAGG TGTTCAGCTA ATCGAGCACG TTTCTCCTCA AATGTTGAAA GCCCAATTGG 10260
AGAGTGTCTT TTCTGATATT CCACCTCAGG CTGTAAAAAC TGGAATGTTG GCTACTACTG 10320
AAATCATGGA AATCATCCAA CCCTATCTTA AAAAACTGGA TTGTCCCTAT GTCCTTGATC 10380
CTGTTATGGT TGCTACAAGT GGAGATGCCT TGATTGACTC AAATGCTAGA GACTATCTCA 10440
AAACAAACTT ACTACCTCTA GCAACTATTA TTACGCCAAA TCTTCCTGAA GCAGAAGAGA 10500
TTGTTGGTTT TTCAATCCAT GACCCCGAAG ACATGCAGCG TGCTGGTCGC CTGATTTTAA 10560
AAGAATTTGG TCCTCAGTCT GTGGTTATCA AAGGCGGACA TCTCAAAGGT GGTGCTAAAG 10620
ATTTCCTCTT TACCAAGAAT GAACAATTTG TCTGGGAAAG CCCACGAATT CAAACCTGTC 10680
ACACCCATGG TACTGGATGT ACCTTTGCTG CAGTGATTAC TGCTGAACTA GCCAAGGGCA 10740
AGAGTCTTTA CCAGGCAGTT GATAAGGCCA AGGCCTTTAT CACAAAAGCT ATTCAAGATG 10800
CCCCTCAACT CGGTCATGGT TCTGGTCCAG TCAACCATAC AACTTTTAAA GATTAAGAAA 10860
AAAAACTCTC TAGTTCCCAC TTTAAGGGAA TTAGAGAGTT TTTATACTCT TCGAAAATCT 10920


CTTCAAACTA 


CGTCAGCTTC 


CATCTGCAGC 


CTCAAAACAC 


mi hiwiviuti/* n /"> 

TGTTTTGAOC 


TOAt. TTLOTL 


10980 


AGTCTTATCT 


AAAACCTCAA 


GGCAGTACTT 


TGAGCAACCT 


GCGACTAGCrl 


Ti C rAGTTTA 


11040 


CTCTTTGATT 


TTCATTGAGT 


ATTAATTAGG 


AAAGAATGTT 


Al\iCAA(_ lli 




11100 


CTTGCGTTTT 


TGCCTCAATA 


TCTTCTGCTT 


GCA 1 1_ AAA I L. 


At. Is 1 ALAALH 




11160 


CTATGCCAGT GCCCATAAGC TGATCAATAT TCTCCGAAGT CAAGCCTCCA ATAGCAACTA 11220
CTGGAATGGC AACCGTTTGG CAAATTGTTT TCAAGGTCGA TATCAGAGTA ATGGGCGCAT 11280
TTTCCTTGGT GGTGGTTGGG AAAATGGCTC CTGTACCCAA GTAATCTGCA CCTGATTTCT 11340
CCGCTTCCAG AGCTCTTTTA ACCGTTTTAG CGGTGACACC GAGGATTTTT TCAGGACCCA 11400
AGACTTTGCG AGCTACCGAA ACTGGTAATT CATCATCTCC GATATGCAGA CCTGCTGCAT 11460
CAACCGCAAG ACAAACATCC AACCGATCAT CGATTATCAA GGGTACCTGA TAAGCATCTG 11520
TTATTTCCTT GACTTGTTTT GCCAGTTGAT AATATTGATT GGTTGTGAGA TTTTTTTCTC 11580
GCAATTGGAC TATGGTAACC CCTGAACGGC AGGCCGTCTC AACTTTTGCA AGAAAGCTTT 11640
CCACGGAATC TTGATAGCGA TTGGTTACCA GATATAGTCT AAGTGCTTCT CTATTCATAA 11700
ACCTCTCCTT TGATGGTATC TAGCCAATTT TCATCTCTTC TTAGGAGCGA AAGCTGATTG 11760
AGTACTTGGT AACGAAATTC TTCCAATCCC ATTCCTTGAA CAACTATTTT CTCAGCAGCG 11820
ATATTGAGAT AAGAGACTGC TAAGCAAGAA GCTTCAAAAC CAGTCTTTCC TTGGCTGAGA 11880
AAAACAGCTG TTAAGGCTCC AACCAAGTCT CCTGTCCCTG TTATCCAGTC TAATTCAGTA 11940
CAGCCATTTC CCAGTACAGC GACCTGATTT TTCGAAACGA CGAGGTCCTT GGGACCTGTG 12000
ACTAAGAAAG ACATACCAGG ATAGGTCTGA CACCAGTCTT TCAAGACTTG AAGCAAATCC 12060
TCCGTTTCTT GATCTTTAGC ACTCGCATCG ACCCCAACGC CGTGGTGCTT TAATCCAACA 12120
AGACTTCGAA TTTCTGACAT GTTTCCTTTA AGGACCGTAG GTCTATAGTC TAAAAGGTCT 12180
TTAACTAAGC TCTTACGAAT GGATGAAGTC GTTACGCCAA CCGCATCTAC TACCATCGGG 12240

AGAGAAGATT GGTTTGCATA CGAAGCTGCC ATGCGGATTG CTTTTTCCTT CTCAGCTGAC 12300 

AAATGCCCCA AATTGATGAA GAGAGCCTGA CTTTGCTTAG TAAAATCAAG AACTTCACGG 12360 

GAATCATCTG CCATGACAGG TTTGCATCCC AGAGCCAAAA TCCCATTTGC CAGCATCTCA 12420 

CAAGAAATCT CATTGGTAAT GCAGTGAATG AGGGAACTAG AGCCTATAGG AAAGGGATTT 12480 

GTAAATTCCT GCATCAGTCT ATCCTTTCAC TAAAGAAATA TCCCTGCACT TTTTTAAAGA 12540 

ATTCCTGCTT GATTAAAAAT CGAAAGGCAA TAAAGGAAAT CGCTGTACCA ATCAAGGTTG 12600 

CTCCGAAAAA TCGAGGCGTG TAGATAAACC AGCTAAGCTT AGCAGCTGAT CCTGTAAAGA 12660

GTACCATAAC AGGATAGGAA ACAATGGAAC CAATAATACC TGTTCCCAAA ATCTCTCCTA 12720 

GAGCAGAATA GTGAAATTTT CGACCGTACT TATAAAAGAG ACCTGCTAGA AGGGCTCCAA 12780 

AAGTCGCTCC TGTGAGAGCT AAAGGCGGAA TCCCTTGAGT CGTCATACGG ATAAAGGCTG 12840 

TGACTGTAGC CATAGCCAAG GCATAAACAG GTCCCATCAT GATTCCTGCT AGAATATTGA 12900 

CTACACTGGA CATCGGTGCC ATTCCCTCAA TTCGAAAGAT AGGTGTAAGG ACTACATCAA 12960

GGGCAATCAT CATAGATAAA ATGGTTAATT TGTGAACTTG TAATTGGTGC TTTCTCATGC 13020 

TTCTATTCTT CTCCTTTTTC TAAAGACTGT AAATCGCTCT TCCATGTCTG GTGTTGGTAG 13080 

GCCATTTCCC AAAACTTGGC TTCCATATGA ACACTGATGT GGAAGGCATC TAGCATTTTT 13140 

TGCTTGTCTG TCTCGTCACT TTCTCGATAG AGCTGATTGA CCAGTGCTCC CTCCTCTCTG 13200 

ATCTGTTGCT CTAACTCATC CGTAATATAA GTTTCAATCC ATTGTTGATA GAGAGGATTT 13260

GGTGATGGTT TAAGATTAAG TGATTTGCCT ATATCATGGT ATAACCAAGG ACAAGGAAGC 13320 

AAGCTTGCAA AAGCGATGGC TAAGTTCGGT TCTGCAAATT GCCTATAAAT ATGAGAAATG 13380 

TAATGATAAC AGGTTGGAGC GATTGGATGT TGCTCCATTT CCTGGTCGCT GATTTCCAAT 13440 

TCCTTGAAAA ATTGTTGGCG AATAAATAAC TCACCCTCCA CTAAACCCTG AGCATTTTGT 13500 

TTCAAGAGTC TTTTCATCTC TTGGTTTGAA GTCTTATCAG CCAAAAGATG ATAGATTTCT 13560 

GAGAAAGCCT TCAGATAGTA GGCATCCTGA ATCAGGTAAT AGCGGAAAAT GGCAGGTTCT 13620 

AAATTCCCCT CTTGTAATTG TAAAATAAAG GGATGATGAA AGGAAGCCTG CCAAGCTTTC 13680

TTGGATAATT CCATCGCAAT ATCTGTAAAT TCCATAATAA CTCCTTTATA AAAATAGACT 13740 

GGTTTGAAGC AATAAAAAGA AAAGCAGGTA GATTAATTTT GTTTTTTTAG GAATATAAAA 13800 

AGTCCGATAG CTATTCTTCA ACTGTGCATG TTCGTCATAT CCGTGAGCAG ATAGAGCTCT 13860 

CAGGTAAAGA TGGCGCCACC TAAAGACTGT CATCAGAACC TTACTGTAAA TCAAGGGCGA 13920 

CCAAAAATGT AGTTCTTGAC CACGTAATAG GCAAGCTTCT TTGAGGGACT TGATTTCTTG 13980 
CTGAATGAGA GGAAAAGAAT TGAATACCAC AATCAAGGCA TAGGACCAAG AGCGTGATAG 14040

CCCCTTTTGA GCCAAGTACA AGAGAAGCTC TTTTAGTGAA ACAGAGGAAA CAAAGACAAG 14100 

GCCGATACAA ACTGTCACAA AGGCCCTCGT TCCAAGCATG ACTGCCTGTG AAGCATCTCC 14160 

GTGTAACTGA ACTGCCCAGT AGTTGGCAAA AGATGGTAAA ATGGCAAGTA TGATCATCCA 14220 

AGCTAACATT TTAAATCGAC GGTAATAGAG CATAAAGAGA ATACAAAATG CGACTACCGA 14280 

AAGAGTCAGA GCAATCGAAG GAATGAAAGA TGTTTCCAAG GATAAAATCA GCAAGAAGAG 14340 

ACTGATAATC GGTGTCTGGG TTGCTACTTT GACCATACTA TCTCACCTCC CCTTGGGTAT 14400

TGCTACTCTG AGATGTAAGT GGTTTGGTAA TGGTCACTTC TTTCACATGC CGAAGACCCT 14460

GACTAGTCAT CTCAATCCAA TAATCAACCA CAGAAATCAA AGGGTCTAAA CGATGACTAA 14520 

TGAGCAGAAA ACTTCTTCCT TGATTCCTCT CCTCCACAAT CCACTTGCAA AAATAATGGC 14580 

AGGCTCTATC ATCCAAACCT GCAAAAGGTT CATCTAGCAA GATCACGGAA GCCTTACTGG 14640

TCAAGATGGT CAGGAGCTGA AGAATTTTTT GCTGACCACC ACTTAATTGA TAGGGACTCT 14700 

TATCGACTGC CTGCTCCAAA TCAAAATATC GTAAAGCTTG AAAAATCCGC TGATTTCTTT 14760

CAGAATCAGG TCCATCTAAT TGAAGCTCCT CTCGCAGACT GACTCGGATA AACTGCTTCT 14820 

CAGCTTCCTG AACAACACCA GTCAGATCAC GATACAAACT CTTTTTCTTT TTCAGGACCG 14880 

AACCCTTCCA AGTAATGCTC CCCTTATACT TTTGAAATTG AAGAATAGAC CGAAAGAGGG 14940 

TTGATTTCCC GACACCATTG TCACCCAGGA TACAGGAAAT CCCTTGATAG AATGTGAAAT 15000 

CAGCAATTGA AAAGAGGGGG CGATTACCAA GCTCACCAGT CACACGGTTC ATATGGAATA 15060 

GTTCCGGGCT AGAAGCAACT TCCTTTGAAG CAACCTGTGT CATCTCATAG GAAGGGATTT 15120 

GAAACACTTC CCTTAGTTTT CCGTCTCTTA GCTCCACCAT ATGGTCGATA TAGGCTTTAT 15180 

AGTCAGATAA ATCATGGTCG CACAAAATAA CTGTCTTCCC ATCATAGACC AACTCTTTTA 15240 

GAATCTCCAA TATCTCGATT CTGCTCTTGC GGTCAATGGA AGCGAAGGGC TCATCCAAGA 15300 

GATAGACCCT AGGATTCATG GCAAAGAGGA CAGCCAGCGC TGCTTTTTGC TTTTCCCCAC 15360 

CTGATAAGTG ATGGATGAGA CGGTGCAAGA TGTCCTTGCA ACGACATTGC TGGACAACCT 15420 

CTGCTATTTT AGAATCAATT TCCTGAAGGT GATAGCCGAT ATTTTCCATG GTAAAAACCA 15480

ACTCCTCAAA CAAGCTCTCC ATGGTAAATT GATGATTAGG ATTTTGCAAG AGAATACCAA 15540 

CCGTCTGGAC ACGTTCGACG ATAGAAAGCT GACTGACCTC GCTCCCATCT ATCAGGACTT 15600 

GACCGCTATA GGGAAGAGAA CTAACTTGGG CAATCATTTG AAAGAGGCTG GATTTTCCAG 15660 

ACCCACTACT CCCAACTAAC AAGGTAAAGG CTTGCGCATG AAAAGTAAAA TCAAACGGCT 15720 
CAGAGAAGAT TGGGGACTGA ATCGCTCGTA GTTCCAGACC CATCTATGCT TTTCCTCCAG 15780

TTGCAAACTG ATGATAGAGT TTGACAATGG CACGAACCAA GATGGTACAG AAGAAATAAA 15840 

CAGAAATAAA ACGTACCACA AGCAAGGAAA GGACAAACGG AAGGGAAAAG GCGTAGTAAC 15900 

CTAACTTAAT GTATTCATAG ACAAAGCTAA CAAGCGTAAT CCCAATACTA TTAGCAGTTA 15960 

GAGAGAGCCA ACTTTCATAG CGATTCTTAG TTACGATAAA ACCAAATTCA CTTCCCAAAC 16020 

CTTGAACAAA GCCAGACAAA AGAGCTCCTA GACCAAATTG GCTACCATAA AGGACTTCAG 16080

CAAGCGCAGC TAGCACTTCT CCAATCGTTG CACTTCCGAC TCTCGGAACA AAGATGGCAG 16140 

CAATGGGCGC AGCCATACAC CAGAGACCGA AGAGGATTTC ATTGGCAAAG GCCTGCAAAC 16200 

CAAGAGGTGT TAAGAGTAGA CTGAGAATAT TATACACATA TCCTGAACCA ACGAAAACCC 16260 

CACCAAAAAA GATAGACAAG AAAGCAAGCA AGATAACATC TTTTAACTGC CATTTTTTCA 16320 

ACATAAAAAA CTCCTTTTTT TAAAGAAAAG TGAGGCACTC AAGAAGACCG ACCTAAATAC 16380 

TTTGTATAGC AGACTGAATT TAGAACAGTA CACAAGAACA CTAAAATATT TCTAGAAATT 16440 

AATTTGAATT TTCTAATTGA TTTGTTCGCA TCTTATTTCA ATCTACTATA TCATCTTCAT 16500 

CCAGTTTCGT AAAAGAAAAA ACTCTAATTA CAGATACAAA TTAGAGTTCA GCTTACAAGA 16560 

TTAGACAGTT CTTTTCGACA TACGAAAAAA ACATTTCACA TTTCCCTTCG CCAGTCTTAA 16620 

CTGTATCAGG TTCAATGGGT ATCATCTCAG CCTAAAGCAC CCCAAATGTC TTTATTATTT 16680 

AATTATGTGA TTATTATAAC ACACATTTTA TACTAGTTCA AGAAATTGAA CTGGAAATAC 16740 

AGCCTTGCAC TCACAAAGAC AGCAGATCTT TCTTTTGCAA AAAACAAATG ACCTGTTTGA 16800 

TGAATTAGCC ATTCAAGCTG AATCTGGACA TAGCTTTTTA AAAAAGGAAA ATCCTACTTA 16860 

CTTAGAATCC AAGGATAGAT ATCTATTGTT CACTCATTTC CCGAACAGTT TTTTCTATAT 16920 

TTTTTGCATA CGATATTGCC GAAATGATTG AAACGCCATC CATATTGGTC TTTATAATGT 16980 

CTTTAATATG TTTCGTCTGT ATCCCACCAA TTGCAACTAA AGGCATTTGT GGCAATAGTT 17040 

TTCTCATCAA TTCAAGACCT TCATAACCTA TAGTACCACC AGCATCATCC TTTGACTGGG 17100 

TACCAAATAC AGGCCCAACA CCTACATAAT CTACATATTC AACTTTTGAT TGTTGAAATT 17160 

CTTCTTCGTT TCTTATAGAA AGACCAATTA TTTTATCTGG CATCAATTTT CTAATTTCAT 17220 

CAACACCAAT ATCATCTTGA CCTACATGTA CGCCATCGGC GTCAATTTCC ATTGCTAAAT 17280 

CTATATCGTC ATTAACGATA AATGGAACAT TGTATTTTTT ACAAAGTTCT TTAATTTGGA 17340 

TAGCTAGCTC AAGTTTTTCT AAGCCTTCTA AAGCACCCTC ACCTTTTTCT CGAAATTGAA 17400 

ATAAGGTTAT ACCACCTTTT AAGGCTTCCT CAACGACTGT ATATAGATTT TTTCCTTGGC 17460 

AAGTAGTCGT TCCACAAATA AAATATAGTT TTAGTAATTC TTTATGAAAC ATCTTACTTC 17520 
ACTCTTTTGA ATTCCTTTAC ATCTTCATCT GTAATCTCGT ATAAGGCATT TATAAATTCA 17580

ACTTTAAATG TCCCAGGAAG ATGTCCATTT GGACGTTTTT CTGCTATTTC TCCAGCGATA 17640

TTGTAAACCA ACACTGCTGT TTTTAATGAT TTCAATTCTT GACCTTTTTC TAGTCCGATA 17700

AAGCTTGCTA CTACAGCTCC TAATAAGCAT CCTGTCCCAA TGACTTTCGG CATCATAGCA 17760

CTACCATTAT GAATCATTAC CACTTCTCCA TTAACAGCAA TGGCATCCAC TTCACCTGTT 17820

ACTACTATTG GAATATTGAA CTTCTCATTT GCTGCTAGAG CAATTTCGTC AATATTATCT 17880

ACGCCCGCAC TATCTACTCC TTTAGATGCC ACATCTATTC CTACTAAAGA GGCAATCTCG 17940

CCAGCATTTC CTCTAATCGC TGCTAGTTTA TAATTGTTGA TTAGATCATC TGCTACTTTT 18000

TTTCTATATT CTCCTGCTCC ACAGGCTACA GGATCTAAAA CTGCTGGGAC ATTATATTTC 18060

TCTGCAATTT TCAGAGCAGC TTGGTATAAT TTCCAATTTT CATCTGTCAA TGTTCCTATG 18120

TTTATTAATA AACCACCAGC ATACTTTAAC AAATCCTCTA AATCTGCTGG AAACTCACTC 18180

ATGGCTGGTG AGGCGCCCAG TGCTACTAAT CCATTTGCTG TGAAATTTTT TACTACATCA 18240

TTGGTTATAC AAATGACCAA TGGTGCTTTT TCTTTTAATA ATTTTAAACT TGTCATATTG 18300

AAATCCTTCC TTTTCACTTT ATACGATCTA CTAATTTCGA TTTATCTTTA GTTGAGAATT 18360

TTTTTCATTT ACATTGAATG ATTTATACTC AATGAAAATC AAAGAGCAAA CTAGGAGGCT 18420

AACCGCAGGT TGCTCAAAAC ACTGTTTTGA GGTTGTGGAT AGAACTGACG TGGTTTGAAG 18480

AGATTTTCGA AGAGTCTTAC CTCATCAAAT TTGTAAATAT CATGAGCCTT CTCTAGACAT 18540

CGTAACCAAT ATCAAAAAAA GCTAATTCTA AAGCGACTGC TTGATTCCAG CGTTGCTGAA 18600

GTTCTGTCAA ATCTTCTCGA TTTTTACCGA CACGATTGAG TTCGTCAACC AGAAATTGAA 18660

CCCACTCTGC AAAGAAAGGA CCTCTGTGGA GATTGATCCA TTCCGAATGA ATATAGACTT 18720

CAGGTAAAGC CAAATCTTTA GAACCCCAGT CTAAATAGAG ACCTTCTGCA ATGACCAGCA 18780

TGACCAAAAG ATGGGCATAG TCTGATGAAG CCACCGCCGA ATACATTAGA TCCTGAAAGG 18840

CTTTTGTTAC AGGGTGCAAA GTCACTTCTA GATAGTCATT CTCTGCTACT TTTAACTCTT 18900

TAAAAGCCTT TTGGAAATAA CCATCTTCAT CTGCTTCAAG AAAGCCTAGT TGCTTGGCAA 18960

AACGAAGCTT GGATTCAAGT TTATCTGCGT GACTACGCAG GCACCCAGCA TGGATAAGAA 19020

GGCATCAAAG AAGTGATAAT CTTGAATCAG ATAGTCCTTT AAGACCTTAT TCTCAATTGT 19080

CCCCGCAAAA AGTTCCTTAA CAAAACGATG ATTGATTGCA GCCTGCCAAT CCTTCTGACT 19140

GCTTTTTAAT AATTCTCCAA CAGTCAAACC TGGCTGAAAT GCATAGTCTT GTGTTTCCAT 19200

ATTTACTTCT CCTCTCTTTA CTTGTTAGTA ATTAATAAAA CACCAAGAAA TATCAAGCAA 19260

AATCGTAATT CCACTTGATC CTTTTAAAGC ACATCGAGAG CATTTGCAGA GAGCTAACTA 19320

AACAAGCCTA TCCAGTTTAT ATAAACAAAA AACTCCAATT ACAATCAAGA ATTAGAGTTG 19380

ACTTACAAGA TTAGACCGTT CATTTCACCA TACGAAAAAA CTGTTCACAT TTCCCTTCGC 19440




wt & ffii nan* 


TCAATGGGTA 


TTATCTCAGC 


CTAAAGCACC 


CCAAATGTCT 


19500 


LlAi 1A1 1 1A 


AL. 1 Al_ 1\jAAL 


CAGTATAGCA 


AAAAATGAAA 


GCCCTAGCAA 


GATATTTGAC 


19560 


CGAAAAATAT 




AATATATTGA 


AACTAGAATA 


GTACACCTCT 


ACTTATAAAA 


19620 


CATTGTTAGA 


AATCGA'lVTG 


ACTGTCCTGA 


TTGATTTGTC 


CTATTCTTAT 


TTCATTTTAC 


19680 


TATAGTTTTC GATAGCAATT TATTCTTCCA ATACACGAAG AAAAACCTCC ACATTCAGTG 19740

GAGGCAATCT GTTTTATCAA TACAATTTTA AGTCACGAGG GTCAACTGGG AAGGTTGGGT 19800

TGTATGGATT GTGACGGAGC TTGAAGTGTT TGACATCTTC AATGGTCTGA GTTCCAGACA 19860

ATTGCATAAC TGTCTTCAAT TCCGCATTCA AGTGTTCAAA GACTTGACGC ACACCGACAC 19920

TACCACCGAG AGCCAAGCCA TAGATGACAG GGCGTCCAAT AGCAACCAAG TCTGCTCCTG 19980

ATGCCAAGGC TTTAAAGACG TGTTGACCAC GACGAACACC AGAGTCAAAG ACAATCGGCA 20040

CACGTCTATC AACTGCTTCT GCCACTTCTT GAAGCGAGTC AAAGGCAGCT GGTCCACCGT 20100

CGATTTGACG ACCACCGTGC TTGGTTACCC AGATACCAGA AGCTCCTGCA GCAAGCGAAC 20160


G 1 I CAACGTC 




TGTGGTCCCT 


TGACATACAC 


AGGAAGACCA 


GAGTATTCAG 


20220 


CGATAAATTC TACATCGCGT GGAGACAAGC GTTGTTTAGC TGATTTGTAA ACAAAGTCCA 20280

TTGATTTACC AGCACCTTCT GGCAGGTATT CTTCAACAAT CGGCATGCCA ACTGGGAAGA 20340

CAAAACCATT ACGCTTATCC ACTTCACGAT TCCCCCCTAC AGTAGCATCT GCCGTCAAGA 20400

CAATCGCTTT ATAACCTTCA GCCTTCACAC GGTCCATGAT GTGGCGGTTG ATACCGTCAT 20460

CCTTACTAAA GTAAAATTGA AACCAATGAG GTGTCCCTTG GAGGGCTTCA GAAATCTCTG 20520

GAAGGTCAAC AGTAGAGTAA GAACTGGTTG TATAAAGAGA ACCAAACTCA TGCACACCAC 20580

GCGCAGTCGC CACTTCCCCC TGTTCATTTG CCAATTTATG AGCCGCAACA GGTGCCATAA 20640

TGATTGGAGA AGATAGTTTT TCACCTGCAA ATTCAATCTC TGTACTTGGA TTTTCTACAT 20700

TGCAAAGTGT ATGAGGAACG ATGAGCTTGT GGTTAAAGGC ACGGATATTC TCTCTTAAAG 20760

TGAAAGTATC TTCCGCCCCA CTAGCGATAT AGCCAAATGC TGCTTTAGGA ATAACTTGTT 20820

GCGCCATTGG CTCCAAATCA TAGGTATTGA TGAArTCTAC ATGACCTTCT GCATTGCTTG 20880

TTTTGTATGA CATAAAATGT CCTCCTTAAT AAGTAAGCGT TTACTTTGTG TATTACAAAA 20940

ATATCTTAAC TCTTTTTCAA AACTTTTAAA ATATTTTGTT TGGAAATTTC AGAAATTTTA 21000

TGTCTATGAT AAAAATCCTT ATAACGGCAA TAAAAAATAG ATATTATCCA AAGAAGATTT 21060

TAAGTGCTAC AATAACTGTA TTATTTCTAG ATGGGAGGTT CTATTTTTGG ATTGATCCAT 21120

TGTTGAACAA TATCTACCAC TATATCAAAA GGCATTCTTT CTGACCTTGC ATATTGCAGT 21180

TTGGGGAATT TTGGGATCCT TTCTGCTCGG TTTAATCGTT AGTATCATCC GACATTATCG 21240

AATCCTTGTT TTGGCGCAAG TAGCGACAGC CTACATTGAA TTGTCACGTA ATACGCCCCT 21300

TTTGATTCAA CTCTTCTTTC TCTACTTCGG TCTTCCCCGA ATCGGGATTG TCCTATCTTC 21360

AGAAGTCTGT GCAACGCTTG GGCTTGTCTT TTTAGGAGGC TCCTATATGG CAGAATCTTT 21420

CCGAAGTGGG CTGGAAGCCA TCAGTCAAAC CCAGCAGGAG ATTGGCCTCG CTATTGGTCT 21480

GACACCTCTA CAGGTCTTTT ACTATGTGGT TCTTCCGCAA GCAACAGCGG TGGCACTCCC 21540

CTCCTTTAGT GCCAATGTCA TTTTCCTTAT CAAGGAAACC TCTGTTTTCT CAGCAGTGGC 21600

TTTGGCCGAC CTCATGTACG TCGCCAAGGA TTTGATTGGT CTCTACTATG AGACAGACAT 21660

TGCGCTAGCT ATGTTGGTAG TTGCTTATCT AATCATGCTG CTACCCATCT CACTGGTCTT 21720

TAGCTGGATA GAAAGGAGGC TCCGCCATGC AGGATTCGGG AATCCAAGTA CTCTTTCAAG 21780

GAAATAATCT CCTGAGAATC TTACAGGGAT TGGGCGTTAC GATTGGGATA TCCATCCTGT 21840

CTGTCCTCTT ATCCATGATG TTCAGAACAG TCATGGGAAT CATCATGACC TCCCATTCTA 21900

GAATCATACG ATTTTTAACA CGATTGTATC TGGAATTTAT CCGTATCATG CCCCAGCTGG 21960

TGCTACTCTT CATCGTTTAC TTTGGCTTGG CTCGAAACTT TAATATCAAT ATCTCAGGTG 22020

AGACTTCAGC TATTATCGTT TTTACCCTCT GGGGAACAGC TGAAATGGGA GACTTGGTAC 22080

GTGGAGCTAT CACTTCTCTC CCTAAACATC AGTTTGAAAG TGGACAGGCA CTCGGCTTGA 22140

CTAATGTTCA ACTTTACTAC CACATCATCA TCCCACAAGT CTTAAGAAGA CTGCTACCGC 22200

AGGCTATCAA TCTTGTCACT CGGATGATTA AAACCACTTC ATTAGTTGTT TTGATTGGGG 22260

TTGTGGAAGT GACCAAAGTT GGACAACAAA TCATCGATAG CAATCGCCTG ACCATCCCAA 22320

CTGCTTCATT TTGGATTTAT GGAACCATTC TAATCTTATA TTTCGCAGTT TGCTACCCTA 22380

TTTCCAAACT ATCCACTCAC TTAGAAAAAC ATTGGAGAAA CTAAATGTCT GAAACTATCT 22440

TAGAAATCAA GGAACTAAAA AAATCCTTCG GAGACAATCC CATCCTCCAA GGACTTTCTC 22500

TAGAAATCAA AAAAGGGGAA GTTGTTGTCA TCCTAGGGCC ATCTGGTTGT GGGAAAAGTA 22560

CCCTCCTTCG TTGCCTCAAC GGCTTAGAAA GTATTCAAGG TGGAGATATT CTTCTGGATG 22620

GTCAGTCTAT CGTTGAAAAT AAAAAAGATT TTCACCTAGT TCGCCAAAAG ATTGGCATGG 22680

TCTTTCAAAG TTATGAACTC TTTCCCCATC TGGATGTCTT ACAAAACCTC ATCCTAGGCC 22740

CTATCAAAGC TCAAGGAAGG GACAAGAAAG AAGTAACGGA AGAAGCTTTG CAATTACTAG 22800

AGCGTGTCGG TTTGCTGGAT AAACAACATA GCTTTGCCCG TCAATTATCT GGTGGACAGA 22860

AGCAACGTGT TGCAATTGTC CGTGCCCTCC TAATGCATCC AGAAATCATC CTTTTTGACG 22920 

AGGTGACTGC TTCGCTGGAT CCAGAAATGG TGCGTGAGGT GCTGGAACTT ATCAATGATT 22980 

TGGCCCAAGA AGGCCGTACC ATGATTTTAG TAACCCACGA AATGCAGTTT GCCCAAGCCA 23040 

TTACTGACCG GATTATCTTC CTCGACCAAG GGAAAATCGC TGAAGAAGGA ACAGCTCAAG 23100 

CCTTCTTTAC CAATCCGCAA ACCAAACGAG CCCAGGAATT TTTAAACGTC TTTGACTTTA 23160 

GCCAATTCGG CTCATATCTA TAAAGGAGAT TCTTATGAAA CTATTCAAAC CACTCTTAAC 23220 

TGTTTTAGCA CTTGCCTTTG CCCTTATCTT TATCACTGCT TGTAGCTCAG GTGGAAACGC 23280 

TGGTTCATCC TCTGGAAAAA CAACTGCCAA AGCTCGCACT ATCGATGAAA TCAAAAAAAG 23340 

CGGTGAACTG CGAATCGCCG TGTTTGGAGA TAAAAAACCG TTTGGCTACG TTGACAATGA 23400 

TGGTTCTTAC CAAGGCTACG CTACGATATT GAACTAGGGA ACCAACTAGC TCAAGACCTT 23460 

GGTGTCAAGG TTAAATACAT TTCAGTCGAT GCTGCCAACC GTGCGGAATA CTTGATTTCA 23520 

AACAAGGTAG ATATTACTCT TGCTAACTTT ACAGTAACTG ACGAACGTAA GAAACAAGTT 23580 

GATTTTGCCC TTCCATATAT GAAAGTTTCT CTGGGTGTCG TATCACCTAA GACTGGTCTC 23640 

ATTACAGACG TCAAACAACT TGAAGGTAAA ACCTTAATTG TCACAAAAGG AACGACTGCT 23700 

GAGACTTATT TTGAAAAGAA TCATCCAGAA ATCAAACTCC AAAAATACGA CCAATACAGT 23760 

GACTCTTACC AAGCTCTTCT TGACGGACGT GGAGATGCCT TTTCAACTGA CAATACGGAA 23820 

GTTCTAGCTT GGGCGCTTGA AAATAAAGGA TTTGAAGTAG GAATTACTTC CCTCGGTGAT 23880 

CCCGATACCA TTGCGGCAGC AGTTCAAAAA GGCAACCAAG AATTGCTAGA CTTCATCAAT 23940 

AAAGATATTG AAAAATTAGG CAAGGAAAAC TTCTTCCACA AGGCCTATGA AAAGACACTT 24000 

CACCCAACCT ACGGTGACGC TGCTAAAGCA GATGACCTGG TTGTTGAAGG TGGAAAAGTT 24060 

GATTAGTCAT TAACTCTTAA AAGGAACTGG ATTTTAAGCT CCAATCCCTT TTTAAGATTT 24120 

TACCTATAAC ATCCTGAGTC TATCTAAGAT GTTCAATCTG AACACAGTGT ACATACTTTA 24180 

TCTTCTATTG CATATACTTT ATCACATAAG ATACGAATAT CCTCTTCACT ATGACTAGCA 24240 

ATCAAAATTG TTGTCCCTTT TTCACTAGAG AGCTTTCTAA ACAATGTTCT CATATTTTCT 24300 

ACACTTGATT TATCCAAGGC ATTCATAGGT TCATCTAGTA AAAGAATAGA GGGATTCTCC 24360 

ATAATTGCTT GAGCAATCCC TAGCTTTTTC CTCATACCTA GCGAATAAGT TTTAACTTTC 24420 

TGGTCTTTTT GCTCATATAG ACCAACTATT TTCAGTGTAT CATTGATTTC CTGATTACCA 24480 

ACTACTCCTC GTATGCTTGC CAAATATTGT AAATTCTTAA AGCCACTATA ATAATTTATA 24540 

AAACCAGGTT CTTCAATCAA AGCTCCCAAA TTAGCTGGAA TTTTTCTCTC AGGAACAATA 24600 
TTTTCCCCAT TGATTAACAC TTCTCCATAA GACGGACTAT ATAAACCAGC TATTAATTTA 24660

AACAATACAC TTTTCCCTGA GCCATTCGCA CCAGTAATTC CTATAATTTC CCCCTGTTTA 24720

CAACTAAAGT TAAGGTTTTG AAAAACACAT GTCTTTTTTA ATTTCAACTC AATATTTTTT 24780

AATGTAATTA TTTCATTCAT TCTATAAACC TCCTCTTTTG ACGAGTGAAA TAGAAAATGC 24840

TTTGAAAAAG AAAGACTAAA AATAGCAAC? GAAGAAATAA ATCTCGTCCT ATATCTCCAT 24900

TCCCTCGATT CAAAATATAA AATAGATAAT TAGTTCGATT TCCTACAAAT AGACCACCAA 24960

ACACAATCAT GAGTAAAAAG AAACTAACGC AAGCAAAGTT CG 25002

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

CAGGTACGGT GAGGCGCAAC TAAAATATAA TTTTCATCTT GATTAGGAAT TTTATCAGTA 60

TTATGATAGT GAGCATTGCC ATTGATGGAC CATAAGAGCA ATACAACTAA TCCACGCAAA 120

TAAGTATAAA ACATGCGATC TCCTTCGATT GTTTTCTTGT TATTATTATA CCTTATCAAA 180

GGAGGGCTGG CAAACTTTTC CCTTGACTAG ATACATATTT AGGATGAAAT TAGAATTCTG 240

TTAAAAAAAA TGATATAATA GAATTTATGG ATAAAAATAA GATTATGGGA TTAACCCAAA 300

GAGAAGTCAA GGAAAGACAG GCTGAGGGTT TGGTCAATGA CTTTACCGCA TCAGCCAGTA 360

CCAGCACTTG GCAAATCGTT AAACGAAATG TCTTTACCCT TTTTAACGCT TTGAACTTTG 420

CCATTGCTTT GGCTCTTGCC TTTGTGCAGG CTTGGAGCAA TCTGGTCTTC TTTGCTGTTA 480

TCTGCTTTAA CGCTTTTTCT GGGATTGTGA CCGAGCTACG AGCCAAACAC ATGGTGGACA 540

AGCTCAATCT CATGACCAAG GAAAAGGTCA AAACCATCCG TGATGGTCAG GAAGTTGCTC 600

TTAATCCTGA AGAATTAGTG CTAGGAGATG TCATTCGTTT GTCTGCAGGA GAGCAGATTC 660

CTAGTGATGC CTTGGTTTTG GAAGGCTTTG CGGAAGTCAA TGAAGCCATG TTAACGGGAG 720

AAAGTGATTT GGTGCAAAAG GAAGTTGACG GCTTACTTTT GTCAGGAAGT TTCCTAGCCA 780

AGTGGGTCAGT TTTATCTCAA GTTCACCATG TCGGTGCAGA CAACTATGCT GCCAAACTCA 840

TGCTTGAGGC TAAGACCGTT AAACCCATCA ACTCCCGTAT CATGAAATCG CTGGACAAGT 900

TGGCTGGTTT TACTGGGAAG ATTATCATTC CCTTTGGTCT GGCTCTCTTG CTGGAAGCCT 960
TGCTTTTAAA AGGCCTGCCT CTCAAGTCAT CCGTTGTAAA CTCGTCGACA GCTCTTTTGG 1020

GAATGTTGCC TAAGGGAATT GCCCTTTTGA CCATTACTTC GCTCTTGACT GCAGTGATTA 1080

AGTTGGGCTT GAAAAAGGTC TTGGTGCAGG AGATGTACTC TGTTGAGACC TTGGCGCGCG 1140

TGGATATGCT CTGTCTGGAC AAGACGGGTA CCATCACCCA AGGAAAGATG CAGGTGGAGG 1200

CTGTTCTTCC GTTGACGGAA ACGTATGGTG AAGAGGCTAT TGCCAGCATC TTGACTAGCT 1260

ACATGGCCCA TAGTGAGGAT AAGAATCCAA CTGCCCAAGC CATTCGCCAG CGTTTTGTGG 1320

GAGATGTTGC TTATCCTATG ATTTCCAATC TTCCCTTCTC GAGCGACCGC AAGTGGGGGG 1380

CTATGGAGTT AGAAGGCTTG GGGACAGTTT TCTTAGGGGC ACCTGAGATG TTGCTTGATT 1440

CTGAAGTCCC AGAAGCTAGG GAGGCCTTGG AGAGAGGATC ACGTGTCTTG GTCTTAGCTC 1500

TCAGTCAGGA GAAATTAGAC CATCACAAAC CACAGAAACC ATCTGATATT CAGGCTCTAG 1560

CCTTGCTGGA AATCTTGGAC CCCATTCGAG AGGGAGCAGC AGAGACGCTG GACTATCTCC 1620

GTTCTCAGGA GGTGGGACTC AAGATTATCT CTGGTGACAA TCCAGTTACG GTGTCCAGCA 1680

TTGCCCAGAA GGCTGGTTTT GCGGACTATC ACAGCTATGT AGATTGCTCA AAAATCACCG 1740

ATGAGGAATT GATGGCCATG GCGGAGGAGA CAGCTATTTT CGGACGTGTT TCCCCTCATC 1800

AAAAGAAACT CATCATCCAA ACGTTGAAAA AAGCGGGACA TACAACGGCT ATGACAGGGG 1860

ACGGGGTTAA TGATATCTTG GCCCTTCGTG AGGCGGATTG TTCTATCGTG ATGGCGGAGG 1920

GGGATCCAGC AACCCGTCAG ATTGCCAATC TGGTTCTCTT GAACTCAGAC TTTAATGATG 1980

TTCCTGAGAT TCTCTTCGAG GGTCGTCGCG TGGTCAATAA CATTGCCCAC ATCGCCCCGA 2040

TTTTCTTGAT AAAGACCATC TATTCCTTCC TGTTAGCAGT CATCTGTATT GCCAGTGCTT 2100

TACTAGGTCG GTCAGAGTGG ATTTTGATTT TCCCCTTCAT TCCGATCCAG ATTACCATGA 2160

TTGACCAGTT TGTGGAAGGT TTCCCACCAT TCGTTCTGAC TTTTGAGCGA AATATCAAAC 2220

CTGTTGAGCA GAATTTCCTC AGAAAATCCA TGCTTCGTGC CCTACCAAGC GCTCTCATGG 2280

TCGTCTTCAG CGTCCTGTTT GTGAAAATGT TTGGCGCGAG TCAAGGTTGG TCTGAGTTAG 2340

AAATCTCAAC TCTACTCTAT TATCTCTTGG GGTCAATTGG TTTCTTATCC GTATTTAGAG 2400

CCTGCATGCC ATTTACCCTA TGGCGTGTCC TCTTGATTGT TTGGTCAGTA GGAGGTTTCC 2460

TAGCCACAGC TCTCTTCCCA AGAATTCAAA AACTGCTTGA AATTTCAACC TTAACAGAAC 2520

AAACGTTGCC TGTTTATGGT GTCATGATGT TGGTCTTTAC CGTGATTTTC ATCCTGACCA 2580

GTCGTTACCA AGCGAAAAAA TAAATCAAAA CCACCAGTGT GAACTGGTGG TTTGTTCTGC 2640

GGCTATAAGC CGCTTCTACC GGCCAGGGCC AAAGGCCCAC CGAAATAGCT TCCTCGCGCA 2700

CCACTTTCCC GAGCAGGTGC TAAAGCACCT TAGTTACTTC CTCTTATTTA TTTCGCCAGT 2760
AAACGGATCT ACTGACTCGA ATAACGTGAG CTGGTCTGCT ACTCTGTCTT Cra TAATTC 2820

ATTCTGAATA TATTCAGCTA TCACTTTCTG ATTACGGCCT ACCGTATCTA CATAATAGCC 2880

TCTACACCAA AACTTGCGAT TGCCATATTT GTATTTTAAA TTCGCATGCT TATCAAAAAT 2940

CATCAAACTG CTCTTGCCCT TTAAATAGCC CATAAAGGAC GAAACACTAA GTTTCGGAGG 3000

AATACT6ATA AGCATGTGAA TATGGTCTGA ACAAGCATTC GCTTCATGGA TTATTACACC 3060

CTTACGCTCA CATAAGTCAC GTATGATTCT TCCGATACTA GCTTTGTATC TGCCATAAAT 3120

GATTTGACGA CGATATTTGG GTGCAAAAAC AATATGATAT TTACAATTCC ATGTGGTATG 3180

TGATAAACTT TGATTATCCT CTCTCATGAG GTACCTCCTG TATGATATGT TGTAGTGGCG 3240

GAGAAACCAC TTCTATCTTA TCATTTTAGG AGGTTCTTTT TGTTACCACG CTAAAAGCTC 3300

TATGGAACCA CTAGCATAGC TAGTGGTTTT CGGGAGACAA CAAGAAAGAC TGCAATCTGT 3360

GGATTGCAGT TTTTTATACG ATGGATCTAT CGTAGATCTG ATGTGCAAGG CCTACGTGCC 3420

GATCATCTAT CGGTGAACCC AAGAGCGACC CTCAAGCCTG CTTGGATTGA GGTAATAGAT 3480

TCAAATATCT GTAGTTAGAC TATTTGAAGT TTGATGTAAG AAAGAGAAAG CGACAGATTG 3540

AAGTAATTTT AACTCTCTTC TATTGCTAGA ACAAATGGTC GGATAGGTTG GTAGTTTGAA 3600

AATGAAGATG CTATCTATTG TTAAATGGAA CATAGTGTTA TTTATTAGAA AATCGTTTGG 3660

TTTATTTCTT ATCAAATACG AAAAGCAACT TAAATATTTC AACTAAAATA GATGTTATGA 3720

AGAAAAGGTA AAATGATTTT GGCATAGTGA GGTTCTGTTC TATTTGATAT CATATTTTTG 3780

ATAAAAACAA AAATGTCCAT TGCAAAGGAC AAAATGCGAA GTATATTATT TTTTGAAAGC 3840

GATATAATGG ATTCATAAAG GAGGTGTATC GTGTCTAGAA AACAAGAACA AATGGAAACG 3900

TTGTTGCTCC TTTTGCGAGA TAGTAAGGAT TATATATCTG CTAAAGTATT GGGAGAAAAA 3960

TTAAATTGCT CTGATAAAAC GGTTTATCGC CTTGTCAAGG GAATCAACAA AGATTGTCCG 4020

GTAGAAGCAT TCATTTTATC TGAAAAAGGC AGAGGTTTCA AATTAAATCC AAGAAGTTCC 4080

CTOGTGGACG TTGATGGGAA TTTTACAGAG GCTTTTGATC CTGAAGTAAG GCGTGAAAAA 4140

TTACTAGAAC GTCTCTTGTT GACTGCTCCT AAGCCACATT CTATTTATGA TTTAGGAGAG 4200

GAATTCTACG TAAGOGAGTC AGTAGTACTA AAAGATCGTC AGATATTACA AGAGAGTCTA 4260

GCAATTTATG GGTTAGATTT AAAAATGAGA CAACGAAAGC TTTTTATTGA TGGGGATGAG 4320

GCTCAAATTC GTTCAGCCAT TCTAAATCTA CTGCCAATGT TTAATCAGTT GGATTTAGAG 4380

CAAATTACAC AGAATAAGGT TCAGCCTCTT GACGGAGAAC TTGCTCACTT TTGTTTGGGA 4440

TTACTGATTA CACTTGAGAG AGAATTGGGG GTAAACATTC CCTATCCATA TAATATAAAT 4500
ATTTTCTCTC ACCTGTATAT TTTTATCAGT AGGAATCGTC GTAGTACTAG TATTCATGTT 4560

GTAGCACCTT CAAAACCTAC TATTGTTGAT GAGAAAATTT ACAGTGTCTG TCAAAAAATT 4620 

ATTCAAGAAA TTGAACAATA TTTTAGGATG AAGGTTGATG CAGTTGAGAT TGACTATCTT 4680 

TATCAATACG TTGTATCTTC GAGATTGCAA AAACCATTTT CTTCCGGGAA GCTTCCTTTT 4740 

TCTCAGCGAG TTTTAGATGT CACTCATTAC TATTTTAGCC GTATGTGTAT GGACAATAGA 4800 

GAGATTGAAA CGACAGATCC TGACTTTGTT GACTTGGCGA GTCATATCAG TCCCTTACTG 4860 

AGGAGATTAG ATAATAGAGT ACAGATTAAG AATAGTCTTT TATCACAAAT TCTTTTAACC 4920 

TATCCTAATC TGGTTAAAGA GTTAACAACT ATTTCTAAAG AAGTGAGTCT AGTATTTGGT 4980 

TTTGCTTCCT TGAGTCTGGA CGAGATTGGT TTTCTAGTCT TATATTTTGC ACGGTTTCAA 5040 

GAAAAGCGAG CACGTCCTCT AAAAACAGTA GTGATGTGTA CATCAGGTGT CGGAACTTCA 5100 

GAGCTTTTAC GAGCACGATT AGAAAAGCAA TTTTCTGAAT TGGATATTAT TGATGTAGTT 5160 

GCTTATCATC AATTAGATGA GCTGATAAAT CTATATCCAG ATTTAGATTT CATTGTGACG 5220 

ACGGTAGCTT TGCAGGAACC AGCAAGTGTC CCGTTTGTCC TAGTTAGTGT TTTTCTAACC 5280 

GAGGGTGATA AACAACGTCT TCAAGCAAAA ATTCAGGAGA TAAACTATGA ATAATCTTTC 5340 

GCTTGTCCTT ATGGATATAT CTGTTCAAAA TCGTCAAGAA GCCTACAAAG AATTAGCAAA 5400

TCAAATCAGC CTTCTTGTTT CTGAAGATAC AGAAAAAATA GAAGAGCTTC TATATTACCG 5460 

TGAGAGACAG GGAAGTATAG AGGTTGCTAA AGGTGTTCTT CTACCACATT GTGAAGGAAA 5520

CTTTCAACAT CATGTCTTAG TGATTACTAG ATTAAAATCA CCTATCAGAG AATGGTCGAA 5580 

GGATATCCAG TGTGTTGACC TTATTATCGG TTTGGCCATT GCAGTATCAC AGGACAAGTC 5640 

ATGTATTAAA ACATTGATGA GAAGACTAGC AGATGAATCA TTCATAAATC AATTAAAACA 5700 

GTTAACAAAA GAAGAATTAC GGGAGATAAT ATATGGAAAT CAAAGATATT CTTAATGTGA 5760 

GTCTGATCCA GACGGATTTA CAGATGCAGA GCAAAGAAGA GGTTTTTGAG GCATTAGCTC 5820 

AACTATTGGT TGAGACGGGT TATGTGTCTG ATAGAGACCA ATTTATCGAA GGTCTTTATC 5880 

AGAGAGAGGC AGAAGGACAG ACCGGTATTG GGAATTATAT TGCTATTCCC CATAGCAAGA 5940 

GTTCTGCTGT GGAGAAGGCG GGGGTAGTCA TAGCTATAAA TCACAATGAG ATTCCTTGGG 6000 

AGACCATTGA TGGGAAAGGG GTCAAAGTAA TTGTACTCTT TGCAGTTGGT GATGATACAG 6060 

AAGCTGCTAG GGAGCATTTG AAGACCTTAT CACTCTTTGC TCGAAAACTT GGTAATGACG 6120 

AAGTTGTTGC CAAATTAGTT CGGGCTCAGA CATCTGATGA TGTGATTGCA GCTTTTTGTT 6180 

AATAAGAAAA AATTTTGGAG GGTATCCGTA TGAAAATTGT TGGTGTTGCA GCTTGTACTG 6240 

TGGGAATTGC CCACACTTAT ATTGCACAGG AAAAATTAGA GAATGCCGCA AAGGTAGCTG 6300 
GACATGTGAT TCATGTTGAG ACTCAGGGGA CAATAGGGGT AGAAAATGAA TTGAGTCAAG 6360

AGCAGATTGA TGCAGCGGAT GTAGTTATTT TAGCAGTTGA TGTTAAGATT TCTGGTATGG 6420

AACGCTTTGA GGGTAAAAAG ATTATCAAGG TTCCAACAGA AGTGGCAGTC AAATCTCCCA 6480

ATAAACTGAT TGCTAAAGCT GTTGAGATTG TTACGAAATA ACTGAAAATA TTTAAGGAGA 6540

AAATATATGT TGAAACACTT AAACTTAAAA GGTCACTTAT TGACAGCCAT TTCCTATATG 6600

ATTCCAATTG TTTGTGGTGC AGGATTCTTA GTTGCCATTG GTTTAGCAAT GGGGGGTGGT 6660

GTTCCTGACG CTCTTGTAGC AGGAAAATTC ACTATCTGGG ATGCTTTAGC AACTATGGGT 6720

GGTAAAGCCC TTGGTCTCTT GCCAGTTGTT ATTGCTACAG GTTTGTCTTA CTCGATTGCT 6780

GGTAAGCCAG GGATTGCACC AGGTTTTGTT GTTGGTCTAA TTGCCAATTC TGTTGGTTCA 6840

GGGTTTATCG GTGGTATCTT GGGAGGTTAT ATAGCTGGTT TCTTGGTTCA AGCGATTATT 6900

AAAAAGGTCA AAGTACCAAA CTGGATTAAA GGTTTAATGC CAACCTTGAT TATTCCTTTT 6960

GTAGCCTCTT TGGTAAGTAG TTTGATTATG ATTTATATTA TTGGAGCGCC TATCGCAGCC 7020

TTTACCAACT GGTTGACGAG CTTATTACAA AGCTTGGGAA GTGCTrCAAA TGGTTTGATG 7080

GGGGCAGTTA TTGGAATTCT CAGTGCTGTT GACTTTGGTG GCCCACTTAA TAAAACAGTC 7140

TATGCGTTTG TGTTGACTTT ACAGGCTGAA GGTGTGAAAG AACCATTGAC TGCTTTACAA 7200

TTGGTGAATA CTGCTACACC AGTTGGATTT GGATTGGCCT ATTTTATCGC GAAATTACTC 7260

AAAAAAAATA TCTATACTCA AGAGGAAATC GAAACATTGA AATCGGCTGT TCCTATGGGG 7320

ATTGTCAATA TTGTTGAAGG TGTAATTCCG ATTGTTATGA ATAACTTGGT TCCAGGTCTC 7380

ATTGCAACAG GTATCGGTGG TGCTGTTGGT GGTGCTGTTT CTTTGACAAT GGGTGCTGAT 7440

TCTGCTGTGC CATTTGGTGG AGTGCTTATG TTACCAACCA TGACTCGTCC AGTAGCTGGT 7500

ATTTGTGCCT TGTTAGCTAA CATTGTAGTC ACAGGACTTG TCTACGCGAT TTTGAAAAAA 7560

CCAATAAAAC ATGCAGAACC AGTTATGACT GTTGAAGAAG AGATTGATTT GTCAGATATT 7620

GAAATTTTGT AAGAGGGTAA CGATGTCAAG AATTGAATTT TCACCATCTT TGATGACCAT 7680

GGATTTGGAC AAATTCAAAG AGCAGATTAC TTTTTTGAAT GATAAAGTAG CATCTTATCA 7740

TATCGATATT ATGGATGGCC ATTTTGTTCC CAATATTACC TTGTCTCCTT GGTTCATTCA 7800

AGAAGTTCAA AAAATTAGTG ACACACCTTT ATCAGTTCAT CTGATGGTCA CAGACCCAAC 7860

CTTTTGGGTA GATCAAGTTC TCGATTTACA ATGTGAGTAT ATTTGTATTC ATGCTGAAGT 7920

TCTGAATGGT CTTGCTTTTC GTTTGATTGA TAAAATTCAT GATGCAGGTC TAAAGGCTGG 7980

TGTTGTCCTT AATCCTGAAA CACCTGTTTC TACAATCTTT CCCTACATTG ATTTACTTGA 8040
CAAAGCAACT ATTATGACTG TAGATCCAGG TTTTGCAGGA CAACGCTTTT TGGAGTCTAC 8100

CTTGTATAAA ATCCAAGAAC TCCGTCAGCT TAGAGTTCAG AATGGTTATC ACTACATCAT 8160

TGAGATGGAT GGTTCTTCGA GTCGTAAGAC TTTCAAACAA ATTGATGTGG CAGGACCAGA 8220 

TATTTATGTT ATAGGTCGCA GTGGATTATT TGGTTTGGAT GACGATATTG CCAAAGCCTG 8280 

GGATATCTGT TCTAGAGATT ACGAAGAAAT GACCGGAAAA ACAATGCCAA TCAAATAATG 8340 

GTTTGAGAAG AAATTTATTA GTTAGGAGGA ATATATGTCA CTACAATCAG TTAACGCCAT 8400 

TCGTTTTCTT GGCGTAGATG CTATTAACAA ATCTAATTCT GGTCACCCGG GAATTGTCAT 8460 

GGGTGCTGCG CCAATGGCTT ATAGCCTATT TACAAAGCAC CTTAGAATTA CACCTGAGCA 8520 

GCCAAACTGG ATTAACCGAG ATCGCTTTAT CTTGTCTGCG GGTCATGGAT CAATGCTACT 8580 

GTATGCTCTC TTGCATTTAA CAGGGTATAA GGATGTATCC ATGGACGAGA TTAAAAATTT 8640 

CCGGCAATGG GGATCTAAGA CACCTGGTCA TCCTGAAGTG ACGCATACGT CTGGTGTGGA 8700 

TGCGACATCT GGTCCGCTTG GTCAGGGGAT TTCTACTGCC GTTGGTTTCG CCCAAGCAGA 8760 

GCGTTTTTTA GCTGCTAAGT ACAACAAAGA TGGTTTCCCT ATTTTTGACC ATTATACTTA 8820 

TGTTATCGCT GGAGACGGTG ACTTCATGGA AGGAGTGTCT GCGGAGGCGG CTTCTTATGC 8880 

AGGTCATCAA GCTTTAGATA AGCTTATCGT CCTCTACGAC TCCAACGACA TCTGCTTGGA 8940 

TGGTGAGACC AAAGATACTT TCTCTGAAAA TGTTCGCGTC CGTTACGATG CTTATGGTTG 9000 

GCATACAGTT CTGGTAGAAG ATGGAACAGA TTTAGCAGCA ATTTCTACAG CAATTGAGAC 9060 

GGCCAAGTTT TCTGGTAAAC CGAGTTTGAT TGAAGTGAAA ACGGTAATTG GTTACGGCTC 9120 

ACCCAATAAA AGTGGTACAA ATGCTGTTCA TGGTGCACCA CTAGGAGCAG AAGAAACAGG 9180 

AGCAACTCGT AAGTTTTTGG GATGGGATTA CGATCCATTT GAAGTACCAG AGGAAGTATA 9240 

TTCTGATTTC AAGACAAATG TAGCGGATCG TGGTCAGGAG GCATACGATG CTTGGGCTAG 9300 

TTTGGTGTCT GATTACAAGG TTGCTTATCC CGAAGTTGCT AGTGAGATTG ACGCTATTGT 9360 

AGCTGGAAAA TCCCCTGTAA CCATTACTGA AAAAGACTTC CCTGTCTATG AGAATGGCTT 9420 

CTCTCAAGCA ACTCGTAATT CGTCCCAAGA TGCTATTAAT ACAGCAGCAG TTTTACCAAC 9480 

CTTCTTAGGT GGATCGGCAG ACTTAGCTCA CTCTAACATG ACCTACATCA AGGCAGATGG 9540 

CTTACAAGAT AAATATAATC CATTAAACCG CAATATTCAG TTTGGGGTAC GTGAATTTGC 9600 

CATGGGAACA ATCCTCAATG GAATGGCTCT TCATGGTGGT TTACGAGTTT ATGGCGGAAC 9660 

CTTCTTTGTT TTCTCTGACT ACGTCAAAGC TGCTATTCGG CTATCAGCCA TTCAGGAGTT 9720 

GCCTGTAACT TATGTCTTTA CCCATGATTC AATTGCCGTT GGTGAAGATG GTCCAACTCA 9780 

TGAACCAGTT GAACATTTGG CAGGTTTACG CTCAATGCCA AACTTGACTG TTATCCGTCC 9840 
AGCGGATGCC CGTGAAACTC AAGCGGCTTG GCATCATGCC TTGACCAGTA CCACCACTCC 9900

AACTGTCATT GTCTTAACCC GTCAAAACTT GGTAGTTGAA GAAGGGACAG ACTTTGGTAA 9960 

GGTCGCTAAA GGAGCCTACG TCGTGTATGA TACCCCGGGA TTTGATACTA TTATCATTGC 10020 

TACAGGATCT GAGGTCAATC TAGCTATCAA AGCTGCTAAG GAATTGGTTT TACAAGGTGG 10080 

TAAAGTACGT GTGGTATCTA TGCCCTCAAC CGAACTATTT GATGCTCAAG ATGCTACCTA 10140 

CAAGGAAGAC ATTTTACCAT CTAAGACTCG TCGTCGTGTG GCCATTGAAA TGGCAGCGAC 10200 

CCAAAGTTGG TACAAGTATG TTGGTTTGGA TGGCGCGGTC ATCGGTATTG ACATCTTCGG 10260 

TGCGTCTGCC CCAGCTCAGA CTGTGATTGA TAATTATGGA TTTACGGTAG AGAATATCGT 10320 

TGCTCAAGTT AAGTCCCTAT AGAAACCAAT TACAATGAAG ATACAGCTGT TGTCAGACTA 10380 

GCAGATGTAG TGATAGACAC TAATCAGATG ATTGGTTATT TAAAAACTGT AATGAAAATG 10440 

TAATAATTTA TCTACGAAAG TTATAGTAGA TAGTATACAC AATAGAGTAT ACCCTGAAAC 10500 

GGTTGCGAAG TACGCTAATC ACTTTGCTAC TGATCTAGAT AGTTTCTTTA ATCAATAAAC 10560 

ACAGCATCCA CAGATTGACT TAGGATATTG TAAGTTTTTT GAAAGCTAGA GAGAAGGTCT 10620 

CTAAAATTAA AAAACGCATA GTATAGGATG TTGAAATGAT GAACTGCACC CCAAAAGTTA 10680 

GACAGAAAAA AATCTAACTT TTGGGGTGTT TTTATTATGA AATTAACTTA TGATGATAAA 10740 

GTTCAGTTCT ATGAACTTAG AAAACAAGGA TATATCTTAG AGAAGCTTTC AAATAAATTT 10800 

GGGATAAATA ATTCTAATCT TAGGTACATG ATTAAATTGA TTGATCGTTA CGGAATAGAG 10860 

TTCGTCAAAA AAGGGAAAAA TCGTTACTAT TCTCCTGATT TAAAACAAGA AATGATTCAT 10920 

AAAGTCTGAC ATGAAGGCTG GACTAAAGAT AGAGTTTCTC TTGAATACGG TCTCCCAAGT 10980 

CGTACGATAC TTCTTAACTG GCTAGCACAA TACAGGAAAA ACGGGTATAC TATTGTTGAG 11040 

AAAACAAAAG GGAGAGTACC TGAGAGCGGA GAATGCCATC CTAAAAAAGT TAAGAGAACT 11100 

CCGATTGAAG GAGGAAAAAG AGAAATAAGA AAGACAGAAA TTGTTCAAGA ATTAATGACT 11160 

GAGTTTTCGT TAGATCTTCT TCTAAAAGCC ATTAAACTAG CTCGTTGGAC CTACTACTAT 11220 

CACTTGAAAC AGCTAGATAA ACCAGATAAG GACCAAGAGC TTAAAGCTGA AATTCAATCC 11280 

ATCTTTATCG AACACAAGGG AGATTATGCT TATCGCCGGG TTCATTTAGA ACTAAGAAAT 11340 

CGTGCTTATC TGGTAAATCA TAAAAGAGTT CAAGGCTTGA TGAAAGTACT CAATTTACAA 11400 

GCTAGAATGC GACAGnAACG AAAATATTCT TCTCATAAAG GAG 11443 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5338 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:






CCAATTACAT 


TATATTATCA 


AAATCGTCGA 


AACTGGCTCC 


ATGAATGAGG 




fin 


ACTCTTTATC ACTCAGCCAA GTCTCTCCAA 


TGCAGTGCGA 


GATTTGGAAA 




izU 


CATTGAGATC 


TTTATCCGCA 


ATCCCAAGGG 


AATCACCTTG 


nL.^.v.vi J. vi v lu 


Vjw\1v>*jA\j J. 1 


180 


TCTCTCTTAT 


GCCCGTCAGG 


TTGTCGAGCA 


GACCCAGCTT 




/"'/Tin m* h x x n 

1 ATAAAAA 


240 


TCCTGTCGCC 


CACCGCGAAC 


TCTTTAGCGT 


TTCGTCTCAA 




TTGTGGTCAA 


300 


TGCCTTTGTC 


TCTTTGCTCA 


AGAAAAGCGA 


TATGGAGAAA 




rct-TTCGTGA 


360 


AACTCGGACT 


TGGGAGATTA 


TCGACGACGT 


CAAGAACTTC 




TCGGGGTCCT 


420 


CTTCTTAAAC 


AGTTACAACC 


GTGATGTTTT 


AACCAAGATG 


^- ITjIjA luALA 


ATCACCTGCT 


480 


AGCCCACCAT 


CTCTTCACAG 


CGCAACCGCA 


TATCTTTGTC 


Au*_A/WjAI_(_A 


ALLCTCTGGC 


540 


AAAGAAAGAC 


AAGGTGAAAC 


TGTCTGATTT 


GGAGAATTTC 


l X rtUL 1 LA 


(jL I A i GACC A 


600 


AGGGACGCAC 


AACTCCTTCT 


ACTTTTCAGA 


AGAGATTCTT 


IK, L LAAoAAL 


ACCACAAGAA 


660 


ATCCATTGTG 


GTCAGTGACC 


GTGCCACCCT 


CTTTAATCTC 


1 lu/i l IVjLj I J 


1V»L»A Ttjtj 1 1 A 


720 


TACCATTGCG ACAGGGATTT TGAACAGCAA CCTAAACGGA 




TTTCTATCCC 


780 


ACTGGATATT 


GATGACCCGA 


TCGAGCTGGT 


CTATATCCAG 


CATGAGAAAA 


V_V_/\.Vj^.V- liilL 




TAAGATGGGC GAACGCTTTA TAGACTATCT CCTAGAAGAA GTTCAGTTTG ATAGTTGAGA 900

AATGATAAGA ACCAATATGT AGGCTAGCAA CAACCTGCAC ATTGGTTCTT TTTACTTATA 960

ATTAAAAGTT TCCCCTGCCA ACTTATCAGC TAGCTTGGGA AAGAGAGTAT AAAACTTATG 1020

GGCTAGGTTC AACAAAATCG GGAGATTGAG TTCTCGTTTG TTTTTTCCTA TAATCTTGAC 1080

AATCTTTTTA GCCACTGCAT CTGGTTCTAG CAGGAAGCGA TCAACCGATT TAAGATAAGT 1140

TCCATCTGGG TCGGCTTGGT CGAAAAATCC TGTACGGATT GGTCCTGGAT TGACTGTTGT 1200

CACATAGACT CCATAGGGCA TAAGTTCGAG TCGCAGAGCA TTTGAAAAAC CAATAGCCGC 1260

AAACTTGGTC GCTGAGTAAA GACTAGACTT GCCAGTAGCT ATTAGACCTG CCATGCTGAC 1320

GATGTTGATG ATATGCCCTT TGCTGCTTTC CTTCATACGA GCCGCAAGGT GACGAGACAG 1380

ATTCATCAGG GCAAAGGTAT TGACCTCAAA CATCTGGTGA ATATCTTTAT CAGCAATCTG 1440

GTCAAATCCC TCAAAAATCC CGTAACCAGC GTTGTTAATC AAGACATCAA TCTTGCCATA 1500

GCGGAGATAA AGATCAGTTA CCAGAGCTTC TAGGGCTGAA TCGTCGGTAA TATCAATTTC 1560

AATCAATTCT GCATGGGAAT AATTTCCGTA GAGTTGGGCT AATTTTTCCT TATTTCTACC 1620

AAGCAAGATG AGTTGGTCAT TGGGCAGGAG TTTGACCATT TCTTGAGCTA GACCACCGCT 1680 

AGCTCCGGTA ATGAGAATAG TAGGCATACT TATCCTTTCT GTGACTGCTA GATTTCCACT 1740 

TCTTCCAAGT CTTTGACCAC ATGGACATTT TCAAAAATTG TGGCAGCGTC TTTCTTGAGT 1800 

TTGCTAATAT CTTTTGAGAG GAAACGGGCA CTGATATGGT TGAGTAGGAG GCGTTTGGCA 1860 

CCTGCTTCTA CCGCTACTTG TGCAGCTTGC ATATTAGTTG AGTGACCATG GTTACGAGCA 1920

ATTTTTTCAT CACCCTTGCC ATAAGTGGAC TCATGAACTA GGACATCTGC ATTGACAGCC 1980 

AGACGCACAC TGGCACCCGT TTTTCGAGTG TCTCCTAAAA TAGTGATAAT CTTACCTGGA 2040 

CGTGGCGCTG AGATATAGTC TGCTGCCTTG ATTTCAGTTC CGTCTTCCAA AACAAGATCC 2100 

TGGCCGTTTT TGATTTTACC AAAAAGCGGG CCGAACGGAA CACCAGCAGC CTTGAGTTTT 2160 

TCAGCATCCA GCGTCCCTTC TAGATCCTTT TGCATGACAC GATAGCCAAC ACAGAAAATA 2220 

GTGTGGTCCA GCTCCTCTGC ATACACAGTG AATTTATCGG TTTCAAGAAT TTTACCCAGA 2280 

GAATCTTGGT CAAACTCATG GAAATGAATG CGGTAGGGCA GACGAGAACC TGACACACGA 2340 

AGGCTGGTTA AGACAAATGA CTTGATTCCT TGAGGTCCGT AGATTTCCAA ATCTGTCTGC 2400 

TCTTCATTGG CCTGAAAGGC ACGGCTAGAA AGGAAACCTG GCAAACCAAA AATGTGGTCT 2460 

CCATGCAGAT GGGTAATAAA GATTTTGCTG ACCTTACGTG GTCGAATTGT GGTTTCCAGA 2520 

ATGCGATTTT GCGTACCTTC TCCACAGTCA AAGAGCCAAA CTTCGTTAAT CTCATCCAAA 2580 

AGTTTCAGGG CGAGACTTGA AACGTTGCGG GCTTTAGAGG GCTGACCAGC CCCCGTTCCT 2640 

AAAAATTGAA TATCCATTCG ATACTTTCTA ATTAATCAAT ATATAACATG GCTGTGCGGT 2700 

TTTCCGATCG GAAATAGCGT TTGCCAGAAA AAGCAGCAGC TTCTTGCAAT AAATCCTCTT 2760 

GGCTGTAGCC TTTGAGACGT TTTCGACCAT CAGCCAATCT TTCCAAATCA GTCAAAGCTG 2820 

TGAGACTTTC TAGGCTGATA ACTTCCTCGT CCTCGACAGG CTTCATGTAA ATCTTACCAG 2880 

ACTCTTCAAA GACTAATTGA TGGGGGAAAA TTTGCGCAAT TTCAAAGAGC AAGTCATCCG 2940 

AGATTTTCTC CTCATTTTCA AAGAAAATCC GACCAAGGCC GTCACTCTCA TAACAAAAAC 3000 

CAAAGGATTT ACCAGACAGA TTAAGCCGAA TAAAAGGCTT ATTTTCTAGG GTGAAACTTG 3060 

GCTCAGTATT GTAAAGATTC AGTTCCTGAC TGAGTTCTGC AAAATAATCC GTCGCAGCCT 3120 

GAGGACTCTT TTTCTGATAG AGTTCTGCAA AGTAGGCATT AACAACACTT GGCGGAGGTG 3180 

TAATAAGTGT TAACTGCTCC TGATCTGTTT TACCAGCTAG AAGCTGATCC AGATAGACCT 3240 

TGTCCAGACT TGTATAACCT CCATACTTTA GAGCCAAAGT TTTAATATCA GTCATAAAAT 3300 
TCTTCTAACC TCCATTTATT TTTCTCGGAA ATGTAGCCTG TAATCACTTC GCCGTCTTCC 3360

TGATAATCAC GTTCTTCCAG AATTGCAACA CTCTCTAAAT CATGAATCTT GTAGGACTTT 3420 

GAAAAAGGCA CTCGCAGGGT AAATGCTTCA AAAATTTCCT TAATCTTATC TAGCAATAAT 3480 

GCTTGCAAGT TTTCACGACT GTCCTCAGAC TTGGCAGAAA TGAGGGTATA TGGCGTTTGG 3540 

GTAGGCGTGA AATCCTCCAC CAAATCCGCT TTATTATAAA GCGTCAAGTG AGGAATATCT 3600 

TCCATGTCCA GGTCTTTCAT GATGGAGAGA ACCGTTTTTT CATGCTCCTC GTGGTAAGGA 3660 

TTGCTAGCAT CGATAACATG AACCAGAAGG TCCACATGCT TGCTTTCTTC CAAGGTTGAC 3720 

TTGAAACTGG ACACCAACTC TGTCGGCAAA TCTTGGATAA AGCCAACGGT ATCTGTCAAA 3780 

GTTACTTGGA GATTGCCTCC CAGATGAATA CTCTTGGTTG TCGCATCCAG AGTCGCAAAG 3840 

AGCTCATCTG CTTCATACTG GGTCTTACTG GTCAAGATGT TCATGATAGT TGATTTCCCA 3900 

GCATTAGTAT AACCAATCAA ACCAATCTTA AAAGTGCTAG ACTCCAAACG TTTTTCTCTG 3960 

ACAGTCGCAC GATTTTTCTC AACCACCTTG AGCTGGCGCT CGATATCCGT GATTTGATTG 4020 

CGAACGCTAC GACGGTTCAG CTCCAGCTGG CTTTCACCAG GACCACGGGA ACCAATTCCC 4080 

CCTgCCTGAC GGCTGAGCAT AATCCCCTGA CCAACCAAGC GAGGCAAAAG GTATTTGAGT 4140 

TGGGCTAGGT GGACTTGGAG CTTCCCTTCA TGGCTTCGAG CCCGCATGGC AAAGATATCC 4200 

AAAATCAACT GCATACGGTC AATGACCTTA ACACCGAGAA CTTCCTCTAG ATTGACATTC 4260 

TGCCTTGGGG TCAGACGATT GTTGACGATG ACAGTAGTGA TTTCTTCTGC ATCCACCATA 4320 

AGCGCAATCT CTTCCAACTT ACCAGAGCCG ACGAAGGTCT TGGAATCATA TTTTTCACGT 4380 

TTTTGTCTGT AGCTATCTAC AACGACTGCC CCTGCCGTTT TCGCTAAACT AGCCAATTCT 4440 

TCCATGGAGA GGTCAAAACT GTCCATACCC TGCAATTCCA CACCAATCAG CAGGACTCGC 4500 

TCCTCTTTTT TCTCCGTTTC AATCATCTAA AAACTCCTCT ATCTGGCTTA AAATGCGGTC 4560 

TTGTACACCA GATTCTCCAA TCTGATAAAA GGTGACCTGC ATGCGATTAC GGAACCAGGT 4620 

CAGCTGACGC TTGGCAAAAC GACGAGTCGC CTGTTTAAGA CTCTCACTAG CTTCCTCCAA 4680 

GGTCTGCTCT CCACGGAAAT AAGGAAAGAG TTCCTTATAG CCAATTCCTT TAGCAGCCTG 4740 

TACATTAGGG GAATGGTCAA ACAGCCACTT GGCCTCATCC AAAAGCCCAG CCTCAAACAT 4800 

CAAATCCACT CGGTGGTTGA TACGCTCATA AAGTTGACTA CGTTCATCAT CCAAGCAGAT 4860 

AATCAGCGGT TCATACAAGG TCTCTTGATT TTCCAAATCC TGACCAAAAT GGGCAATTTC 4920 

TAAGGCACGC ATAGCACGAC GACGATTAAA CTGGGGAATC TCAAGGCCTG CTTGATCCAC 4980 

CAAATGGGCT AATTCCTCAT CTGAATATGG CTCCAAACTA GCTCGATAAG CTAAAATCTC 5040 

CTCATGAGGA GTCTCCCCAC CTAGGTGGTA ACCTTCTAGC AAGCTCTGGA TATAAAGTCC 5100 
AGTCCCACCG GCGATAATGG CTAGCTTGCC ACGGTTGTGA ATACCCTCAA TAGTCATCTT 5160

AGCTTCTGAA ACAAAATCAA AAGCCGAGTA AGACTCGGTT ATCTCTCTAA CATCGATTAA 5220

ATGATGAGGA ACAGCTGCCT GCTCTTCTGG ACTAGCCTTG GCCGTCCCAA TATCAAGTCC 5280

TCGATAGACT TGCTGGCTAT CTCCACTAAC CACTTCGCCA TTAAAACGCT TTGCGGGG 5338

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19446 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CGGAAACCCA TCTAGTCTCC ATCGTTTGGG AGACCAAGCA ACACGAATCT TAGATGCTTC 60

TCGCCAACAG ATTGCAGATT TAATCGGTAA GAAAAGCGAT GAAATCTTCT TTACCTCGGG 120

TGGAACAGAA GGGGATAACT GGCTTATCAA GGGTGTGGCC TTTGAAAAAG CTCAGTTTGG 180

CAAGCACATC ATTGTTTCAG CCATTGAACA TCCAGCAGTC AAAGAGTCAG CCCTCTGGTT 240

GAAAAGTCAA GGATTTGAAG TGGATTTTGC TCCAGTTGAT AAGAAAGGCT TGGTCGATGT 300

TGAGGCGTTA CAGGTTTGAT ACGGCATGAT ACAATCCTCG TTTCCATCAT GGCTGTGAAC 360

AATGAAATCG GCTCTATCCA ACCTATTGAG GCTATTTCAG AATTCTTGGC AGACAAGCCG 420

ACTATTTCCT TCCACGTTGA TGCGGTTCAG GOGCTTGCCA AAATTCCGAC TGAAAAGTAT 480

CTGACAGAAC GGGTGGATTG CGCGACTTTC TCTAGTCACA AGTTCCACGG GGTTCGAGGT 540

GTTGGCTTTG TCTATATCAA ATCTGGCAAG AAGATTACAC CTCTTCTTAC AGGTGGTGGC 600

CAGGAGCGAG ATTATCGTTC GACAACTGAA AATGTGGCAG GGATTGCAGC GACAGCCAAG 660

GCCCTCCGTT TGTCTATGGA AAAGCTAGAT ATCTTTAGGA GCAAGACTGG GCAGATGAAG 720

GCAGTGATTC GCCAAGCTCT TCTGAACTAT CCGGATATTT TTGTCTTTTC AGATGAGGAA 780

AACTTTGCAC CTCATATTCT GACTTTTGGA ATCAAAGGTG TTCGAGGTGA AGTCATCGTT 840

CACGCCTTTG AAGACTATGA TATTTTCATC TCAACAACCT CAGCTTGTTC ATCTAAGGCA 900

GGAAAACCAG CCGGTACCTT GATTGCCATG GGAGTGGACA AAGATAAGGC CAAGTCAGCT 960

GTGCGTCTTA GCCTAGACTT GGAAAATGAT ATGAGTCAGG TCGAGCAGTT TTTGACCAAG 1020

TTAAAATTGA TTTACAATCA AACTAGAAAA GTAAGATAGG AGCATTCATG CAGTATTCAG 1080

AAATTATGAT TCGCTACGGA GAGTTGTCAA CCAAGGGTAA AAACCGTATG CGTTTCATCA 1140

ATAAACTTCG


TAATAATATT 


TCGGACGTTT 


TGTCTATCTA 


TACCCAAGTT 


AAGGTAACAG 


1«UU 


CAGATCGCGA 


CCGTGCCCAC 


GCTTACCTCA 


ATGGAGCTGA 


TTACAPAGPA 


GTTGPAGAAT 


1 0 API 


CTCTCAAACA 


AGTTTTTGGA 


ATTCAAAACT 


TTTCTCCTGT 


TTATAAGGTT 


fiAAAAATPTH 


X J £\J 


TAGAAGTTTT 


GAAGTCTTCT 


GTCCAAGAGA 


TTATGCGGGA 


CATCTACAAG 




lion 


CCTTTAAGAT 


TTCTAGCAAG 


CGTAGCGACC 


ACAACTTTGA 


ACTTGATAGT 

n\« X X uA 1 raw X 


PGTGAAPVPA 




ACCAAACACT 


TGGAGGGGCT 


GTATTCGAAG 




TGTGPAAGTT 


wvinlwUUin 


X jUU 


GTCCTGACAT 


CAATCTTCAG 


GTGGAGATTC 


GTGAAGAAGC 


AGPCTATPTT 


TPTT ATV"; A A A 




CCATTCGTGG 


GGCTGGTGGT 


TTGCCAGTTG 


GAACTTCAGG 


TAAAGGGATG 




1620 


CAGGAGGGAT 


TGACTCACCT 


GTAGCAGGTT 


ATCTTGCTCT 


TAAGCGTGGG 


GTGCATATPP, 


1DOU 


AGGCAGTTCA 


CTTTGCTAGT 


CCACCATATA 


CTAGTCCTGG 


•PGPCCPPAAG 


AAAGPGPAPP, 


1 1 Aft 


ACTTGACCCG 


TAAATTGACC 


AAGTTTGGCG 


GAAATATPPA 


fiTTT » m a n a 


rtTflPPTTTP A 


1 OPiPi 
lOUU 


CAGAGATTCA 


AGAGGAAATC 


AAAGPPAAAfl 


v_VJ L. vnunnu I— 


, I W PA1 W MVJATV1 


AL. X AAU 1 t_ 


1 O £ A 


GTCGCTTTAT 


GATGCGGATT 


AOTGACCGTA 


TTPGTGAGGT 


APP.AAATTIGT 


1 1 UU 1 1 A 1 V_ A 




TCAATGGGGA 


AAGTCTAGGT 


CAAGTAfiPPA 






nnuuL 1 A i t. A 


1980 


ATGCTGTTAC 


CAACACTCCC 


n 1 V— .rt. i i. \-.o 1 V. 


XVJ lUU 1 1 Ak. 


V_A 1 AL. AA Vj 


1 IajoAAA tca 


2040 


TTGACATCGC 


CCAGGAAATC 


vJ/\ 1 Al>l« X X X\J 


rtv-rt 111 V^rir\ 1 


LLnnLLu ill 




2 100 


GTACCATTTT 


TGCACCAGAT 


P.GTCCAAAAA 


PAAATPPTAA 
V-AAA 1 <~v_ i AA 


AATTAARAAT 






ACGAAGCGCG 


TATGGATGTT 


GAAGG CTTGG 


TTGAGCGAGP 


AGTGGPTGGA 


ATP ATT2 ATT A 
A 1 l»M lun 1 1 A 


zz/u 


CTGAAATCAC 


ACCTCAAGCC 


GAAAAAGATG 


AAGTTGATGA 


CTTGATTGAP 

V» X X X X VJ^w 




2280 


AATTCAGAAA ATCCAAAAGA 


ATAGCGAAAA 


TCAGTAAAAA 


x x *»vj xxx 


TTTCTCTAAA 


2340 


AACAGGTAAA AAACTAACTT TTTTTATTTT TATGATATAA TGATATAAAA TTTTGAATAT 2400

AGAGAGTTTT CTGACAATGA ATCAATCCTA CTTTTATCTA AAAATGAAAG AACACAAACT 2460

CAAGGTTCCT TATACAGGTA AGGAGCGCCG TGTACGTATT CTTCTTCCTA AAGATTATCA 2520

GAAAGATACA GACCGTTCCT ATCCTGTTGT ATACTTTCAT GACGGGCAAA ATGTTTTTAA 2580

TAGCAAAGAG TCTTTCATTG GACATTCATG GAAGATTATC CCAGCTATCA AACGAAATCC 2640

GGATATCAGT CGCATGATTG TCGTTGCTAT TGACAATGAT GGTATGGGGC GGATGAATGA 2700

GTATGCGGCT TGGAAGTTCC AAGAATCTCC TATCCCAGGG CAGCAGTTTG GTGGTAAGGG 2760

TGTGGAGTAT GCTGAGTTTG TCATGGAGGT GGTCAAGCCT TTTATCGATG AGACCTATCG 2820

TACAAAAGCA GACTGCCAGC ATACGGCTAT GATTGGTTCC TCACTAGGAG GCAATATTAC 2880

CCAGTTTATC GGTTTGGAAT ACCAAGACCA AATTGGTTGC TTGGGCGTTT TTTCATCTGC 2940

AAACTGGCTC CACCAAGAAG CCTTTAACCG CTATTTCGAG TGCCAGAAAC TATCGCCTGA 3000

CCAGCGCATC TTCATCTATG TAGGAACAGA AGAAGCAGAT GATACAGACA AGACCTTGAT 3060

GGATGGCAAT ATCAAACAAG CCTATATCGA CTCGTCGCTT TGCTATTACC ATGATTTGAT 3120

AGCAGGGGGA GTACATCTGG ATAATCTTGT GCTAAAAGTT CAGTCTGGTG CCATCCATAG 3180

TGAAATCCCT TGGTCAGAAA ATCTACCAGA TTGTCTGAGA TTTTTTGCAG AAAAATGGTA 3240

AGTTAAGAAA GGAAAAAACG AAATGCATAT TGAACATCTT AGCCACTGGA GTGGTCATCT 3300

TAACCGTGAA ATGTACCTTA ACCGTTATGG ACATGGTGGG ATTCCAGTTG TGGTCTTTGC 3360

TTCATCAGGT GGTAGTCACA ACGAATACTA TGATTTTGGC ATGATTGATG CCTGTGCTTC 3420

CTTTATCGAG GAAGGCCTTG TCCAGTTCTT TACCCTATCT AGTTTGGATA GTGAGAGCTG 3480

GTTGGCTACT TGGAAAAATG CTCATGACCA AGCGGAAATG CACCGTGCCT ACGAACGTTA 3540

TGTGATTGAG GAGGCCATTC TTTTATCAAG CACAAGACAG GTTGGTTTGA TGGCATGATG 3600

ACGACAGGTT GCTCTATGGG AGCCTATCAT GCACTCAATT TCTTCCTCCA GCATCCAGAT 3660

GTCTTTACCA AAGTGATTGC TCTCAGTGGT GTTTACGACG CACGTTTCTT TGTCGGTGAT 3720

TACTACAACG ATGATGCTAT TTACCAAAAC TCGCCAGTAG ATTATATTTG GAACCAAAAC 3780

GACGGCTGGT TTATTGACCG TTACCGTCAG GCAGAGATTG TGCTGTGTAC GGGGCTTGGA 3840

GCCTGGGAAC AAGATGGTTT GCCATCCTTT TACAAGCTCA AAGAAGCCTT TGACAAGAAA 3900

CAAATTCCAG CCTGGTTTGC TGAATGGGGA CATGATGTCG CCCATGACTG GGAATGGTGG 3960

CGTAAACAAA TGCCTTATTT CCTCGGTAAT CTCTATTTAT AAAAGGAGTT ACCTATGAAT 4020

TACCTTGTTA TTTCTCCCTA CTATCCACAA AACTTTCAAC AGTTTACCAT CGAACTAGCT 4080

AATAAAGGCA TCACAGTCTT GGGAATTGGT CAAGAGTCTT ACGAGCAATT GGATGAGCCC 4140

TTGCGCAATA GCTTGACCGA GTATTTTCGT GTTGATAATC TTGAGAACAT AGATGAAGTC 4200

AAACGTGCAG TTGCTTTTCT CTTTTATAAA CATGGTCCAA TTGGCCGCAT CGAGTCTCAC 4260

AATGAATACT GGCTTGAGCT AGACGCAACA CTCAGAGAAC AATTCAATGT 1TTTGGTGCC 4320

AAACCAGAGG ATCTCAAAAA GACGAAATAT AAGTCTGAAA TGAAGAAACT TTTCAAAAAA 4380

GCAGGTGTTC CTGTGGTACC TGGAGCTGTT ATCAAGACGG AAGCAGATGT TGATCAAGCA 4440

GTGAAAGAAA TCGGTCTTCC AATGATTGCC AAACCTGATA ATGGAGTGGG AGCAGCCGCA 4500

ACCTTTAAAC TTGAGACAGA AGACGATATC AATCACTTCA AGCAAGAATG GGACCATTCA 4560

ACCCTTTATT TCTTTGAAAA ATTTGTCACT TCCAGCGAAA TCTGTACCTT TGACGGGCTC 4620

GTGGACAAGG ATGGAAAGAT TGTCTTCTCA ACAACCTTTG ACTACGCCTA TACACCGCTT 4680

GACCTCATGA TTTATAAGAT GGACAATTCT TATTATGTGC TCAAGGATAT GGATCCTAAA 4740

CTGCGCAAGT ATGGGGAAGC AATTGTCAAA GAATTTGGTA TGAAAGAACG GTTTTTCCAT 4800 

ATTGAGTTCT TCCGTGAGGG GGACGATTAT ATTACCATCG AGTACAATAA CCGCCCTGCA 4860 

GGTGGTTTTA CCATTGATGT TTATAACTTT GCTCATTCCT TGGACCTTTA TCGTGGCTAT 4920 

GCAGCTATTG TCGCAGGAGA GGAGTTCCCG GCGTCAGACT TTGAAACTCA GTATTGTTTG 4980

GCTACTTCTC GCCGTGCAAA TGCTCACTAT GTTTATTCAG AAGAGGATTT GCTTGCCAAA 5040 

TATAGCCAGC AGTTCAAGGT TAAAAAAGTC ATGCCAGCTG CCTTCGCGGA ACTTCAAGGA 5100 

GATTACCTGT ATATGCTGAC CACTCCGAGT CGACAAGAAA TGGAGCAGAT GATTGCAGAT 5160 

TTCGGACAAC GTCAAGAATA AGAACTATCG GATTAAGGAA ATTAACTCCC TTAATCCTTT 5220 

TGTTTTGTCT GATAAAAAAT AAGAGCATCC CAACAAGGTA GCTATCATAA AACTTGTTCG 5280 

ATAACTATTT GAAGCAGGAT TAGGTGGTCA GAAATTAAAT TTTAATATTT CAATTGAGTC 5340 

ATAGTATTGT GTTTGCGTAT CCTTAAATCA GCTAAAAGGA TCCATGACGA CACCTATACG 5400 

ATATAGTTTT CAAGATACCA AACAAGTCTA TTAATATTCA ATGAAAATCA AAGAGCAAAC 5460 

TAGGAAGCTA GCCGCAGGTT TCTCAAAACA CTGTTTTGAG GTTGTGGATA GAACTGACAG 5520 

AGTCAGTATC ATATACTACG GCAAGGTGAA GCTGACGTGG TTTGAAGAGA TTTTCGAAGA 5580 

GTATAAAATA TTCAGGTGAC GCATAGATAT AGTTAATTGA AGCTTTGTTT GAAATCTGAT 5640 

AAAATAATGA TATTACTAAG TTTTAAAAAC TAAAGAAAAG GGAAGATATG ATTACAGGCG 5700 

AATTAAAAAA TAAAATCGAT CAGCTGTGGG AAATTCTTTG GACAGAAGGA AACGCAAATC 5760 

CTTTAACAAA TATTGAACAG TTGACTTATC TCTTATTTAT GAAAGATTTG GATAGTGTCG 5820 

AGCTTGGACG TGAAAGTGAT GCTGAATTTC TAGGGATTCC TTATGAGGGA GTTTTTCCAA 5880 

AAGATAAACC TGAATACCGT TGGTCAACTT TTAAAAATAT AGGAGATGCT CAGGAAGTTT 5940 

ATCGTTTAAT GACTCAGGAG ATTTTTCCGT TTATTAAAAA TCTCAAGGGG GATACAGATG 6000 

ATACAGCCTT TTCACGATAT ATGCGAGAAG CTATTTTTCA AATAAATAAA CCTGCTACGC 6060 

TTCAAAAGGC AATTTCTATC TTAGATGTTT TTCCAACTAG GGGATTAGAT GTAGATTTTG 6120 

ATAATGACAA ACAAAGTATT ACTGATATCG GAGATATCTA TGAATATCTG TTATCAAAAT 6180 

TGTCGACCGC AGGTAAAAAT GGACAGTTCC GTACACCTCG TCACATCATC GATATGATGG 6240 

TTGAGTTGAT GCAACCGACT ATCAAAGATA TCATCTCAGA TCCCGCTATG GGTTCTGCTG 6300 

GCTTCTTAGT ATCTGCTAGC CGTTACTTAA AGCGTAAGAA AGATGAATGG GAAACCAATA 6360 

CAGATAATAT CAATCATTTT CATAATCAGA TGTTTCATGG AAATGATACG GATACGACTA 6420 

TGTTGAGACT TGGGGCGATG AACATGATGC TACATGGAGT AGAAAATCCA CAAATCAGTT 6480 
ACCTTGACTC GCTGTCTCAA GATAATGAAG AAGCCGATAA ATATACTTTG GTTTTAGCAA 6540

ATCCTCCTTT TAAGGGCTCA CTTGACTACA ATTCAACCTC TAATGACCTT CTTGCAACCG 6600

TAAAAACCAA AAAAACAGAA TTACTCTTTC TTTCTCTTTT CTTGCGAACT TTAAAACCAG 6660

GTGGACGAGC AGCAGTTATC GTACCTGATG GTGTCCTTTT TGGTTCGTCT AAAGCTCATA 6720

AAGGAATTCG TCAGGAAATT GTAGAGAATC ATAAGCTTGA TGCTGTAATC TCAATGCCTA 6780

GTGGTGTGTT CAAGCCTTAT GCTGGAGTTT CAACTGCCAT TCTCATCTTT ACAAAAACTG 6840

GTAATGGTGG TACTGACAAA GTCTGGTTTT ACGATATGAA AGCGGATGGT TTAAGTTTGG 6900

ATGATAAGCG ACAACCGATT AGCGACAATG ATATTCCAGA TATTATCGAA CGCTTTCATC 6960

ATCTTGAAAA AGAAGCAGAA CGTCAGAGAA CGGATCAATC TTTCTTTGTT CCAGTTGCTG 7020

AGATAAAGGA AAATGATTAT GATTTGTCTA TCAATAAATA TAAAGAGATT GAGTATGAAA 7080

AAGTTGAGTA TGAACCAACA GAAGTCATAT TAAAGAAAAT CAATGATTTA GAAAAAGAAA 7140

TTCAAGCTGG CTTGGCTGAA TTGGAAAAAT TACTCAAGTA GGGAGGTGGC TGTATGAAAA 7200

AAGTGAAGTT GGGGGAAGTC TTATCTCTAA AAAAAGGCAA GAAAGCCACT GTACTTGCTG 7260

AACAAACAAC TCTAAGCCAA CGTTATATTC AAATAGATGA TTTAAGAAAT AATAATAATT 7320

TAAAATTCAC TGAAAGTTTA AATATGACTG AAGCACTCCC AGATGATATT CTGATAGCAT 7380

GGGATGGAGC TAATGCAGGA ACAGTTGGTT ATGGATTATC GGGAGCTGTT GGTAGTACAA 7440

TTACGGTCTT AAAAAAGAAT GAGCGATACA AAGAAAAAAT TATATCAGAT TACTTGGGAG 7500

TCTTTTTGGA AAGTAAATCG CAGTATTTAC GAGATCATTC AACAGGTGCA ACAATTCCTC 7560

ATTTAAACAA GAATATATTA CTTGATTTAC AATTAGAATT GCTAGGTATC GAAGAACAAG 7620

AGAACATTAT CTGTATTCTT AATACGATTA AAAGGCTTAT TACTAAAAGA AAATTTCAGT 7680

TAGATGAACT AAACTTGCTC GTCAAATCCC GATTTAACGA GATGTTTGGG GAAAATAAAA 7740

TATTTGAAAG CATTGATAAC TTATTTGATA TTATAGATGG TGATAGGGGC AAAAATTATC 7800

CTAAATCAGA TGAGTTGTTT AGTGAGGAGT ACTGTTTATT TTTAAATACA AAGAATGTTA 7860

CTAAAAACGG ATTTTCATTC GATACAAAGC AATTTATCAC TAAAACAAAG GATAAATTAC 7920

TTCGAAAAGG CAAACTTGAG CGTTATGATA TAGTCTTGAC AACAAGAGGT ACTGTTGGAA 7980

ATGTAGCGTA CTACGATGAA TTAATAAAAT ATAAACATTT ACGTATAAAT TCAGGTATGG 8040

TAATATTACG TCCCAAGACA CCAAATCTAA ATCAGAAATT TATTATCCAT GTTTTAAGGA 8100

ATAATAATTA TAGTCGAGTG ATATCAGGAA GTGCTCAGCC TCAGTTACCA ATTACAAAAT 8160

TAAAAAAAAT ACTTCTCCCC CTCCCCCCAC TAGCCCTCCA AAATGAGTTC GCAGACTTTG 8220

TAGTCCAGGT CGACAAATCA CAATTTGCTT GTGAGATAGC TATAAAAGTG TGGAGAAATA 8280

GCTTGAAATT TAGTATAATA TAGCTAAACT ATTTGTTTAA AGTGAGAAAA AAATGGGAAA 8340 

TTTTAGCTTT CTTTTAAAAA ATGACGAATA TGAATCTTTT TCAAAACCTT GCATTGAAGC 8400 

TGAGAATATG ATTGCTACAT CAACTGTGGC TACTGCCTTT ATGGCGCGTC GTGCTTTAGA 8460 

GCAGGCTGTC CATTGGATAT ATAGTCACGA TTCATATTTA GAAGCTCCCT ATCGTGCTAC 8520 

TCTATCTTCT TTAGTATGGG ATGATGATTT TAGGGATATC GTAGATTCTG AACTCCACAA 8580 

GCAGATAGTT CTGTTGATTC GGTGGGGAAA CCATGCTGCT CATGGTGGTG AAATTAAGGA 8640 

ACGAGAAGCG ATTTTAGCTT TGCATCATTT GTATCAGTTT GTTAATTTTA TCGATTATTG 8700 

TTACAGCAAT GAGTTTGTGG AGCGTTATTT TGATGAGAAG TGCTTACCAC TTTCAGCAAA 8760 

CATCAAATAC CGAGAAACTC CACAATCTAT GATAAAGTTA CAAGACAGTT TACCAGAACT 8820 

GCCTGATTTT CATGAACAGA TGGCTGCTCA GTCCGTAGAA GTTCAAGAGA CTTATACTGA 8880 

AAAACGTGAG ACTGCAGCGC AACGGCAAGA TGTGCCTTTC CATATTGATC AATTATCTGA 8940 

GGCAGAGACA AGAAAGCTCT TTATTGATAT CGATCTCCGT TTAGCAGGAT GGATATTTGA 9000 

AGAAAACTGT CGTGTTGAGA TAGCCGTTGA TGGTCTCAAG CACGGTTCAG GAATTGGTTA 9060 

CTGTGACTAT GTACTTTATG GTAAAAATGG GAAAATTTTA GCGATTGTGG AGGCTAAAAA 9120 

AGCCTCTGTC AATCCAGAAG TAGGGGAAGT ACAGGTCAAA GAATATGCTG AAGCTTTGGA 9180 

GAAACATATC GGCTATCAGC CAATTTGCTT TATTACAAAT GGGTTGAAGC ACTATATACT 9240 

TGATGGTCCG AACCGCCGCC AGATTGCAGG CTTTTACTCT CAAGAAGAAT TGCAATTAGT 9300 

GATGGATAGA CGTCATCTTC AAAAACCGCT TGAGGATATT TCTAGTAAAA TTAGGGACGA 9360 

TATTTCCGGG CGTCACTACC AAAAACATGC CATTGCAAGC GTTTGTGAAG CTTTCTCTGA 9420 

TCATCGTAGA CAGGCACTTT TGGTTATGGC AACTGGGGCG GGGAAAACTC GTACAGCAGT 9480 

TTCTCTAGTT GATATCTTAT CACGTCATAA CTGGGTAAAA AACGTTCTCT TCTTAGCCGA 9540

TAGAACTTCC TTGGTTAAGC AAGCATATGA TTCGTTTAGA AAATTACTCC CAGATCTTTC 9600 

CGTTTGTAAC TTCTTAGAAG ATAAAGAAGG AGCTCAATCA AGTCGCATGG TCTTTTCAAC 9660 

TTATCCGACC ATGATTGGAG CGATTAGTGG TCAAGAAGAA GTAAATCAAC GCCCTTTCAC 9720 

TGTTGGGCAT TTTGACCTTA TCATAATTGA CGAATCTCAC CGTTCTATTT ATCAGAAATA 9780 

CAAGTCCATT TTTGATTATT TTGATGCAAG AATTGTAGGC TTAACAGCTA CTCCGCGTCA 9840 

AGATTTAGAT AAAAACACCT ATGGATTCTT TAATTTGGAG AATGGGGTTC CAACATATGC 9900 

ATATGATTTG GAAGAGGCTG TTAAAGACGG ATATTTAGTA GCCTATCATT CTATCGAAAC 9960 

CAAACTGAAA CTACCTACGG ATGGTCTACA TTATGATGAT TTGTCCGAAG AAGAAAAGGA 10020 
ACATTTTGAT AGCAAATTTG AAGACAATAG CTGTGAAAAA GATATTGATG GGAGTGTATT 10080

TAATTCCTTT ATTTTCAATA AAAGTACAGT AGAAATTGTT TTAAATGAAC TCATGACAAG 10140 

AGGAATTCAG ACAGCCTCGG GTGATGAAAT TGGTAAAACT ATTATTTTTG CTAAAAATCA 10200 

TGATCATGCG GAATATATCA GAGGTATTTT TAACAACCGC TATCCTGAAA AAGGGAGCGA 10260 

CTATGCTCAG GTGATTGATT ATAGTATTAA GCATTATCAG ACCTTGATTG ATGATTTTAA 10320 

AATTAAGGAG AAGTATCCTC AAATTGCGAT TTCTGTCGAT ATGTTAGATA CAGGTATTGA 10380 

TGTACCAGAG GTTGTTAATT TAGTCTTCTT CAAGAAAGTA CGCTCTAAAA CTAAGTTTTG 10440 

GCAGATGATT GGTCGAGGAA CCCGTCTATG TAAAGATTTA TTTGGACCTG AGCAGGATAA 10500 

GGAAAACTTC TTGGTATTTG ATTATGGGGA CAATTTTGAT TATTTTCGTG CAGATCCAAG 10560 

AGATGGAGAG GGTCGTCACA TTGTTTCGCT GACTCAGCGT TTATTTAATA TCAAAGTGGA 10620 

CTTGATTCGA GAACTTCAGG GACTCCAATA CCAAGAAGAT CAGTTTGCGA GAGCATACCG 10680 

TCAGCAGCTT GTCTCGGAAC TTCAAGGTCG TATAGAGAGC TTAAATGAGT TGGACTTCAG 10740 

GGTTCGTATG GTTTTAGATA CAGTTTATAG CTATAGGAAA TTGGAAAGTT GGCAGAATCT 10800 

AACTGCTGTT ACAAGTGAAA CCATTCAAAA AAATCTCTCT CCGCTTTTAT TTGATGAAGA 10860 

TAAAGAAGAT GAGATGGCGA GGAGATTTGA TTTGTGGTTG CTTCATATTC AGTTGGGGCA 10920

ACTGACAGCT AAATCTTCCA CTGTTCATAT TTCCCAAGTG ATGAAGACGG CTAGAGCTCT 10980 

TTCTGCTATT GGCAATATCC CGCAGGTTTT TGAGCAGGCT GAAATTATCA GGAAAGTACA 11040 

GGAGCCTGAA TTTTGGAAAG AAGTTAACTT GTCTGATTTG GAAAAAATTC GTCTTGCTAT 11100 

TCGAGATTTA TTACAGTTTT TGGATAAAAC AGACCGTAAA CCCTACTATG TTAACTTTGA 11160 

AGATCGTATA CTCTCCACTG TTCACGAGAC CACAGCATTT TTGCAGGTCA ACGATCTTCG 11220 

GTCTTACAAT GAAAAAGTTG AGCATTATTT GAAAACTCAT CTGGATGAGG AGTCCATTTC 11280 

TAAGCTATAC CATAATAAAA AGTTGACATC TGATGATATG CTTGCACTTG AAAAATTGCT 11340 

TTGGGAAAAA TTAGGTAGTA AAGCAGACTA CCAAAGTCAT TATGAAAATA AGGCAATTCC 11400 

GAGATTGGTT CGTGAGATTA TTGGCTTAGA TAGAGAGTCT GCCAATCGTA TTTTTTCTAA 11460 

ATTTTTGTCG GATGAGAATC TTAATGCCAG GCAGATTTCA TTTGTAAAAT TGATTGTAGA 11520 

CTACATTGTA GAAAATGGTT TTTTAGAGAC GAAAGTGTTA ACGCAAGAGC CGTTTAAATC 11580 

TTATGGTTCT GTTCAACTAC TCTTCCAACA CCAACTACCA GTACTTCGTA ATATTGTTCA 11640 

AATCATTGAA CTTATCAATA ATCGAGCTGG AGAAGCGGCT TAAATTCTAA AGTGATTGCC 11700 

ATGCTGAGAC TCATTTAAAA TTAAAAAGAG TAGAAATTTA TGCTATATAT GAGAAGTTTT 11760 
ATTAGGAAGA ATGTCATCGT TTTCCTAGAA TACAGTATCA GTTGTTAAGT GGTTGATAAA 11820

TTTCAAAGTA GATACTTGTA CCACGATGTT TGTTGATCGA GTTATTAACA AAAGAGCTAC 11880

TTTGATTTTA AAGAAATAGA AAACAAAAAG CCGAGCAAGA ATTCAATTGC AGGAGAAAAT 11940

GAAATAATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAA CTAGCTGCAG GCTGCTCAAA 12000

ACACTGTTTT GAGGTTGCAG ATGGAAGCTG ACGCGGATTG AAGAGATTTT CGAAGAGTAT 12060

AAATCTTCCT AGGATAAAGC AAAACGCATA GTATCAAGGG TTTTCAACAC TTGATACTAT 12120

GCGTTTTCTG ATGTTAAAGA CTTTCTACCA GGTTTTTTAA AAGCATAATT GTTAGTTGTA 12180

GTCATTTATT ATTCTTCAAA GAAAAATGGT GGGGCGAATT TTTTCAGTTC TTCAAAGCAC 12240

TTTTGAGCAG TATCTGCATC TTCACAGATG ATAAGACAGA CATCATTACC ACAAAGGGTA 12300

GCGATAGCGT CAGGGAAGCT CAAAGTATCA ATGATAGAAC CAAAGGATTG AGCCAGTCCA 12360

GGAAGGGTTT TTAGTAGGAC TTGGTGTTGA ACTGGGCGCA TCCAGACAAG GGCGTCTTCC 12420

ATGTAGAGTT CGAGACGTTT TTCCCATTTT GAGATGGAAC CATTGTTAAG AACATAATAA 12480

GCGCTATCTT CTTCGCGGAC TTTTGATAGG TTCATATTTT TGATGTCGCG TGAGAGGGTT 12540

GCCTGGGTTA CTTGAATGTC GTTCTCAGCA AGAAGGGCTT GCAACTCAGC CTGTGTATGA 12600

ATCTTGTTTT TTGTGATAAG AGCGCGTATA AGTTGGTGGC GGTGTTCTGA TTTATTCATA 12660

ATAATGTAAC TCCTTTTAGC AAGGTAAGGT AAGCATGGAC TGAGCGAGGT CGACAGTCAA 12720

GTGGTAGTCT GTATTGTCAC GGATGGTGAT TTCAAAGTCA GTAGTATAGA GGACTAAACG 12780

GAGAGTGTCT CCTTCTTTTA GCTTGTAAAT AGTTGGCTGC AGTTCAAATT GAACGTCCAT 12840

CCATTCATCT GCAGTAATAT CCTCTACTAA CAGTAAATCA TTTCTATTTT GTAAATTAAG 12900

GTAACCTTTT GTCACGACTC GTTGTGCCTC TGGTCTAAAT GGCAATTCAC AGAGATTTTC 12960

CAACATGTGA TAGCGACCGT TGTCAATGGT TCTAGCACTT AAAATAGCTG GATAAGGTTG 13020

TAGGTATTTC TTTTGCCCAA ATTCTAGCAG TTGGGCAGAT AAGAGCCCCT TGTTTGTACT 13080

GGATTTGATA CGAAGATTGA GCTGAGCGCG ACCGTTTAGG TGGAGATCTT TAGTCACAGG 13140

AAGGTTAATA GTAATCTGAT TGGCTTTCCC TTGATAGAGC TCTGTATTGA AGGTTTGGTA 13200

TGTCTTACCA TAGCGCTCAA AATCCTTATC TGGGTACTGG TTTTGAATAG CTTGCTCTTC 13260

TTGACCAAGT GAGAAGGTTT CACAGTTTTC TTGCCCACCG AAGTTATCAA GTGATAACCA 13320

AGTCTGTGGA GCTGTATTGT CCTGCCAGAT AACAGTAGGA AGTTGAAAGT CTGTTTCCTG 13380

TCCTAGTAAT TTCTTGGTCA ATAAGGCATT TATGGACTCA CGGAAGTCAA TTGATTGCCA 13440

ATTGTTCATG TAAACATGGG CACCATTATG GAAAAAGAGA TGCTTGTGTA TATGAGTAGG 13500

AAGAGCATGG AACATCTGGT AAACATGAAG TGGTTTGACA TTCCAATCCT GAGAACCATG 13560

AGTAAAGACA ACCTCTGCCT TTACTTTATG GGCATTGAGC AGATAATTGC GGTCATGCCA 13620

AAACTGATTG TAGTCCCCAG TTTTTCGGTC TAGCTGAGCT TTCACTTTTT CTAAGTCAGC 13680 

TTGGTGAGCT TCATTGCCAC GGATATAGTC GCCAGCTAAG AGATTACGAG AATAGGTTAA 13740 

CTCAGCAAGG GAGTCAAAGT CCTCACCTGG ATAACCACCT GGGCTAGTCA CCAGACCGTT 13800 

TTCACGGTAG TAGTTGTACC ATGATGAAAT TCCTGCCTCG GCAATGATAA CTTCTAAACC 13860 

ATCGACTCCT GTAGTCGCAA GACCATTGGA CATGGTACCT AGATAGGAAA GTCCTGTTGT 13920 

AGCAACTTTT CCGTTTGACC AATCAGCCTT GACTTGACGC TGGCGCGTGT GATCAGTAAA 13980 

GGCACGGCAA CGACCGTTAA GCCAATCGAT GACATTTTTA TAAGCCTCGA TTTGCTGGTA 14040 

GTCTCCATTA GTCATGAAAC CTGTCGAGTC TTTGGTACCA ACACCTGAGA CATAGAGATT 14100 

GGCAAAGCCT CTCGGAAGGA AGTAGTCGTT TAGTGTATAG CTAGAGTTGA TGTGAGTTAG 14160 

CTTTTCCTCA GCCTCTGCTA TAAGCTCAGC TTTACCTTGG GGTTGGACGA GATTTAGTTG 14220 

AGGTTTCTCT AGCTCAATCT TGTGAGGAAG CTTAACCTCA AGCTCGCCCT CCATCTTGTA 14280 

GAGAGCCTTG TCACTAGCCT TGTCATTGGT TCCCTGATGA TAAGGGCTGG CTGTCATGAT 14340 

GGCAGGGATT TTTCCATCAA AACGAGGGCG AATAATGCTA ACCTTTACTA GGTCTGATAG 14400 

CCCTTTTTGG TCAGTATCGA CACGAGACTC AACGTAAACG ACTTCACGAA TGACATCCTG 14460 

GTTAGAAAAA GTAGCCAAAC TCTTGCCGTT AAAGTAGTGG TAGTCATTAT CCTCCGGAAT 14520 

AAGACCATCA CTAACAAGTT GGTCGATAAG AGTATTTCCT TTTTTGGTGC GAGTATTGAG 14580 

TAACTGATAG AGATTTTCAA TCAAGTCACC ATATATAATG GGAAATCCAG TTTCTTTACG 14640 

AAAAACGTCA CTATCTTCGA AGTCAACCAA ATAAGAAAAG CCTAAAAGTT GAAAAGCAAC 14700 

AGTATAAAAA ATATCTGCTG TCAGTTCATC TTCTGATTGA AAAAATGTCA GCAGGTCTGT 14760 

TTTTTTATCA GCTGCTAGGA TAGAAAGTGG GTAGTTGGTG TCTTGATAAG TGAAAAAGAA 14820 

ACGACGTAAA AAGGTTTCAA GTGAGTCTTT GTGATTGGCT GTATTTTGTA AATCAAAGCC 14880 

ACATTTTTTT AGTTCAGATA AGACATTTTC TTTTGGAAAA TTGATATAAC TATATTGATT 14940 

AAAACGCATA GAACCTCCAT ATAGAATGAC AGTTAAGGTT ATTATATCAA AAAAAAAGCA 15000 

GAAAGGGAAT TGTTAACTTC AAAAGGAAAT AATCCAATAA AAATGAATAA AGTACTAAAT 15060 

TCAATATAGA GAACAGAGTA ACAATAAGAA TAAATAGATA GGGTATAAAA GTTCTAGGAG 15120 

ATTTATATTA TATGCTTTCT ATTTTTATAT ACAATATAGT ATAAATATAA AAATGATGAC 15180 

AAAAATACAA ATGAATAGAA AATAAATTAG TAAGCTGATG AAATTTTTCT CAAGAGAAGC 15240 

CATTTATAGG TGAAAATGGT ATAATATAGT GAGAAGGATA GAGGAGAAGT GTAAATTGAT 15300
CGCACAACTA GATACAAAAA CAGTCTATAG TTTTATGGAA AGCGTCATTT CGATCGAAAA 15360

GTATGTGAGA GCAGCTAAAG AATACGGCTA CACTCATTTG GCTATGATGG ATATTGACAA 15420 

TCTTTATGGC GCTTTCGACT TTCTAGAGAT TACAAAAAAA TACGGCATTC ATCCTTTGCT 15480 

AGGGCTTGAA ATGACAGTGT TTGTAGATGA TCAGGGAGTG AATTTGCGCT TTTTAGCTCT 15540 

ATCTAGTGTG GGCTATCAGC AGTTGATGAA GCTTTCGACA GCCAAGATGC AGGGGGAGAA 15600 

AACTTGGTCA GTCCTGTCCC AGTACCTGGA GGATATCGCG GTCATTGTGC CTTATTTTGA 15660 

TAGAGTTGAG TCGTTAGAAC TAGGCTGTGA TTACTATATA GGGGTTTATC CAGAAACACT 15720 

AGCAAGCGAA TTTCATCATC CTATCTTACC TCTTTATCGG GTCAACGCTT TTGAAAGCAG 15780 

GGATAGAGAA GTTCTTCAAG TTTTAACAGC GATTAAAGAA AATCTACCGC TCAGAGAAGT 15840 

TCCCTTGCGT TCGAGACAAG ATGTCTTTAT ATCAGCAAGT TCTTTAGAGA AACTATTCCA 15900 

AGAGCGTTTT CCGCAAGCTT TGGACAATTT AGAAAAGCTT ATTTCAGGCA TTTCTTACGA 15960 

CTTGGATACT AGTCTGAAWC TGCCTCGTTT TAATCCAGCT AGACCAGCAG TAT5AGGAGTT 16020 

GAGAGAGCGT GCTGAACTGG GGCTTGTTCA GAAGGGGTTG ACTAGTAAAG AATATCAAGA 16080 

TAGACTAGAC CAAGAATTGT CTGTTATTCA TGATATGGGC TTTGATGATT ATTTCTTGGT 16140 

TGTTTGGGAT TTGTTGCGTT TTGGACAATC GAATGGCTAT TATATGGGAA TGGGAAGGGG 16200 

TTCTGCAGTA GGCAGTTTGG TTTCTTATGC CTTAGACATC ACGGGGATTG ACCCAGTAGA 16260 

GAAAAATCTG ATTTTTGAAC GCTTTCTTAA TCGTGAACGC TATACCATGC CTGATATTGA 16320 

TATTGATATC CCAGATATTT ATCGTCCAGA TTTTATCAGA TATGTTGGTA ATAAATATGG 16380 

TAGTAAACAT GCGGCACAAA TCGTTACTTT TTCAACCTTT GGAGCCAAGC AAGCTCTTCG 16440 

AGATGTCTTG AAACGCTTTG GTGTGCCAGA GTATGAATTA TCTGCAATTA CTAAGAAAAT 16500 

CAGTTTTCGT GACAATCTTA AGTCGGCCTA TGAGGGAAAT CTCCAGTTTC GTCAGCAAAT 16560 

CAATAGTAAG TTAGAATACC AAAAAGCTTT TGAGATTGCT TGCAAGATAG AGGGCTATCC 16620 

AAGGCAAACC TCTGTCCATG CGGCTGGTGT TGTAATTAGT GACCAAGATT TAACCAACTA 16680 

CATTCCTCTA AAGTATGGTG ATGAAATTCC ACTGACTCAG TATGATGCTC ATGGAGTTGA 16740 

GGCTAGCGGA CTTTTGAAGA TGGACTTTCT GGGACTACGA AATTTGACCT TTGTCCAGAA 16800 

GATGCAAGAG TTGCTTGCTG AAACAGAAGG TATTCATCTG AAAATTGAAG AAATCGATTT 16860 

AGAAGACAAA GAAACGTTAG CTTTATTTGC CTCTGGTAAT ACAAAAGGTA TCTTTCAATT 16920 

TGAGCAACCA GGTGCCATTC GTCTGCTTAA GCGTGTGCAA CCAGTCTGTT TVGAAGATGT 16980 

CGTCGCGACT ACTTCTCTAA ATCGACCGGG TGCTAGTGAC TATATCAATA ATTTTGTGGC 17040 

AAGAAAGCAT GGGCAGGAAG AAGTGACTGT TCTGGATCCA GTACTGGAGG ATATTTTGGC 17100 




TCCAACCTAC GGCATAATGC TCTATCAGGA CC^TC CAGG^GCCC AGCGAC.GC 17160 

CTTCGGAAAG <*«**«™ -CTCOCCCT ATGGGGAAAA AOCATCCC.C 17220 

TGCCATGCAT GAGATGAGGG C^AT ^ TTAGAACCTG G^ATAC^ 17280 

GGAAAAAGCA GAGCAGGTCT 17340 

OTCACACOCC TATGCCTACT CAGCC^CC C^CCA^ AAACGCA^A 17400 

TCCAGCCATT TTTTATCAGG TCATGTTAAA T.cncouc AGTGATTACT TAATAGATGC 17460 

ACTTGAAGCA GGTTTTGAAG IMCnCTCT ATCCATCAAC ACCA^CCCT ATCACGATAA 17520 

AATTGCCAAC AAGGCCATCT ATCTAGGTTT GAAATCCATT AAAGGAGTCA GTAATGATTT 17580 

****** ATTATTGAAA ATAGACCTTA ^AACA,. GAAGATTTTA TAGCTAAATT 17640 

ACCTGAGAAT TATCTGAAAC TTCCTCTGCT AGAACCTTTG GTAAAAGTTG GTCTTTTCGA 17700 

TTCATTTGAA AAAAATCGTC AAAAAGTATT TAATAACTTA GCTAATCTAT ^AA^ 17760 

gaaagag™ ggaag^g, ttggagatgc tatttatag, tggcaggaa, cggaaca™ 17820 

GACGGAACAA GAAAAATTTT ATATGGAACA AGAGCTTTTA GGGATAGGTG TCAGCAAACA 17880 

TCCACTACAA GCTATTGCAA GTAAGGCTAT TTACCCGATT ACCCCAATCG GAAA^TGXC 17940 

AGAAAATAGC TATGCTATTA AGTTCAGAAA ATAAAAGTGA TTCGTACCAA 18000 

AAAGGGTGAA AATATGGCCT TCTTACAGGC AGATGATAGT AAGAAAAAAT TGGATGTCAC 18060 

TCTCTTTTCA GACTTATATC GTCAGGTTGG ACAGGAAATA AAAGAGGGAG CCPTC.ACA 18120 

TGTAAAAGGA AAAATACAAT CACGTCATGG CCGTCTCCAA ATGATTGCAC AAGAAATAAG 18180 

AGAAGCAGTT GC^AACGC rrr^ GGTGAAAAAT CATGAATCGG ATCAAGAAAT 18240 

-CACGCA*. TTAGAACAAT TTAAAGGCCC AATCCCAGTC ATCATCCGGT ATGAAGAGGA 18300 

ACAGAAAACC AW8mcTC ^ 18360 

ATTGAATGAA ATCGTTATGA AAACGATTTA TCGCTAAAAA TACGGAAAAT AGAAGAATTT 18420 

TCAACGTAAA TGTGGTATAA TCAGTAAGAA TGTTAAAAGA AAAAGGAGCA TAACCAATAT 18480 

GAAACGTATT oCWm** CTAG^ 18540 

TGCAGTTG1T CGTCAAGCAA TTTCAGAAGG AATGGAAGTT TTTGGTATCT ATGACGGATA 18600 

TGCTGGTATG GTTGCCGGTG AAATTCATCC CCAGA^CA GCTTCAGTAG GGGACATCAT 18660 

TTCTCGTGGT GGTAC^CC ^CAC^GC TCGTTACCCA CA^CGCC AACTTGAAGG 18720 

GCAACTTAAA GGGATTGAGC AATTGAAAAA ACACGGAATT GAAGGTGTAG TTGTTATCGG 18780 

TGGTGACGGA ^PTACCACG GCGCTATGCG TTTGACTGAA CA^GCTTCC CAGCTATTGG 18840 

TCTTCCAGGT ACAATCGATA ACGATATCGT TGGTACTGAC TTTACAATCG GTTTTGACAC 18900 

AGCGGTTACT ACTGCCATGG ACGCTATCGA TAAGATTCGT GATACATCAT CAAGTCACCG 18960 

TCGTACTTTT GTAATCGAAG TTATGGGACG TAACGCTGGT GATATCGCTC TTTGGGCTGG 19020 

TATTGCAACT GGTGCTGATG AAATCATCAT CCCTGAAGCA GGCTTCAAGA TGGAAGATAT 19080 

CGTAGCAAGC ATCAAAGCTG GTTATGAATG TGGTAAAAAA CACAATATTA TCGTCTTAGC 19140 

TGAAGGTGTG ATGTCAGCGG CTGAATTTGG TCAAAAACTT AAAGAAGCTG GAGATACAAG 19200 

CGACCTTCGT GTAACAGAAC TTGGACATAT TCAACGTGGT GGTTCTCCAA CTGCGCGTGA 19260 

CCGTGTTTTG GCGTCACGTA TGGGTGCACA TGCTGTTAAA CTTCTTAAAG AAGGTATCGG 19320 

TGGTGTTGCG GTTGGTATTC GTAACGAAAA AATGGTTGAA AATCCAATTC TTGGTACTGC 19380 

AGAAGAAGGG GCATTGTTTA GCCTTACTGC AGAAGGTAAG ATTGTGGTTA ACAACCCAGC 19440 

TACAAA 19446 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16593 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

TCGTAAATAT GCTCTGTTTT TGGATTTTGT TTCTTAATCT GTTTGGCAAG TGCCTTCATC 60 

ATAGAAATAG GACCACACAT ATAGACGGTT GCATGTTCGG GCACTTCTTT TTGTTCAAAA 120 

TTAAGATAGC CGTCTTTCGT ACTGTCGATT AGATGGAGTT CAAAATTAGG ATTTTTCTGA 180 

GCATAGTTAC GGAGTAAATC TAGGTAGACT GCATTTTCAT CTCCACGGAA GCTATAGTAG 240 

AAGTGAACCT GTTTATCTAA AATAGGATGT TCACGGATGT AAGAGATGAA GGGGGTGATC 300 

CCAATACCTC CAGCAATCCA AACCTGATTT TCTCGTCCTT CTTCTATGAT CATGTGTCCG 360 

TAAGCTCTGT CTAGGGTTAC TTTGCTGCCG GCTTGAAGAT TATCATAGAT ATTCTTGGTA 420 

TGGTCGCCTG AAGTTTTAAC AGTAAAGTAA AGAGTTTGAC CATGACCTCC TGAGATAGAA 480 

AAGGGATGCG GAGCACTTTC AAAGCCTTCT TGGAAAATCT TTAGAAAGGC AAATTGTCCT 540 

GATTGATAGT TGAAAGGTCT GCTAAGATGG ATTTGAATTT CTCTAGTATC GTGATTTAAG 600 

CGTTTGAGAT GGGTAATTTT CCCTAGATAG GGGAAGGAAA TCTTTTGATA TAGAAAAATG 660 

ATATAAAAAC CAGCTAGTAA GCCTAAAAGG GCATAGCTAC CAACAAGAAA ACTTAGAAGA 720 

TTAAATGTAA GGAGACGATT GCCCATTATC ATGTAGATGT GAAAGAGTCC TAAAATATAG 780 
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GCTAGGTAAA CCAGGCGGTG AATCCATCGC CAAGCTTCGT ATTGGATGTA TTTGCCTAAA 840 

TAGGCGACAA GGATGATGCT GGCAAAGATA TAGATGGCAA GATTGCCAAA CTGAGCAGCT 900 

AAGCGAGAGC CCCACAAACC GCCCATACTA AAGTTATGAA AGATTAGTAG GATGATTGAG 960 

AGAAAGGCTG TGAATTTGTG GACGGTGTAG ACCTTCTCCA AACTGTGAAA CCAGCTTTCT 1020 

AGTAGTGGGA GACGAGTGGC TAGGATAAAA GTCAGAGATA GGCTTGTTAA AGCTAGTCCT 1080 

GGAATCATGA ATTGGGGAGA AGTGTTCATC CAAGTCAAAA GAGTCAAGAT AAAACTAGCT 1140 

ATGATAAAGA GTAGTCCTTT GACTGATTTC ATAGAAAATT CCATTTCATT TAGATTTCGA 1200 

TTTGTTGTAA ATAAATTTGT TACATTTTAT CATAGAAAAT GTATGGTGTC AAATTGAGGT 1260 

CTATAAATAT CTACTCTCAT CAAAAAACTC TCCAATTGAA CTGGAGAGTG GCTGTTTATA 1320 

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGCA AGTTGCTCAA AACACTGTTT 1380 

TGAGGTTGCA GATAGAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTCTGCAG 1440 

CTTGTTGCCA ACGTTTGGCT AGCATATGAG ACAGGCTAGA AATTGCTAGG TTAAAGCTGA 1500 

AGTAGATGAG GGCAATCAGG ATGTAAAGAC TGAAGACCTG CTCTGGTTCG AAATAACGGC 1560 

CCATGAGAAT TTGGCTGGCT CCAAAGAGTT CTTGTAGGGC GATAACAGAG TAGAGGAGAC 1620 

TGGTATCCTT AATCACGGTA ACAAACTGAG AAATGATGGC TGGTAGCATT TTGCGGATGG 1680 

CTTGTGGGAG AATGATGTAG TAGAGGATTT GGGCTGAGGT GAAGCCTTGT GACATTCCTG 1740 

CTTCGTACTG TCCCTTGTCT ACGGCATTGA GACCGCCTCG AATAATCTCA GCCAAGGCTG 1800 

CTGATGTAAA GAGAGTAAAG GCTGTAATAC CTGCTGGTGT GGATTTCATT TTGAACACCA 1860 

AAAAGATAGT AAAAATCCAG AGAAGGTTGG GAACGTTGCG CACAAACTCG ATATAAATAC 1920 

TGGAAATAAT GCGTAAGACA GGATTTTTGC CATTTCTCGT GACAGCTAGC ACCGTACCGA 1980 

TGATAGTAGA GAGGATGATG GCAATCAGAG AAATATAGAG GGTCAAGCCA AATCCTTTAA 2040 

AGATAAAGAC TAGGTTATCT GGGGTTAAAA CTTCTAAAAT AGATTCCATA GTAACCTCCT 2100 

AAAGTGAATA GGCTTTTTTG TTGGCTTGCT CCATCTTGCG ACCAAACTGG GCAACAGGGA 2160 

AGCATAGAGC AAAGTAGAGA AGAGCAGCAC CTAAAAAGGC TGGTATATAG TTTCCGTTGA 2220 

GAGCCGACCA AGACTTAGTC ACAAACATCA AGTCTACTCC AGAGATGATA GCTACAGTAG 2280 

AGGTGTTCTT GATGAGGTTA ACAATTTGGT TGGTCAATGG AGGGAGAATG ATGCGGAAGG 2340 

CCTGAGGCAA GATAATCAAG CGCATGGCAC TGATATAGGT AAAACCTTGC GACAAGGCGG 2400 

CCTCCATCTG ACCACTAGGA ATAGACTGAA TCCCTGAACG AATAACCTCA GCGATATAAG 2460 

CGCCGTGATA GAGTCCCACG CAGAGAACGG CTGTCCAATA AATTGGAATC ATGATGATAT 2520 
GGTCACTGAT AAGAGGTAGG CCATAAAAAA CAATAACAAA CTGCACCAAG AGGGGAGTAT 2580 
TTTGGTAAAA TTCAACAAAG ATGCGAGCTA AAATGCGTAA AATTGGACGT TTACTGGTTG 2640 
ACATGGCACC AAAGAAGATG CCCAAAACCA TAGCGAGGAT AAAGGAACCA ACCGCTAGGG 2700 
CAAGGGTGAA GAGGAAACCA TTGAAAAATT GTCCAAAATC CTGAAAATAG GCTGTCCAAG 2760 

ATGATAAATC TGTCATGQGG TGTCCTCCTT AATCTGCAGT ATGGCTAGAT GGTTTGAGCT 2820 
TGTAACGGTC ATAAAGTTTC TGCAAACTAC CATCCTTGCT CCATTTAGTA ACCAAGTTAT 2880 
CAAGATAGTC GTTGAGCTCT GTATTTGATT TCTTGGTAAC AATACCGTAG TCAGATGGCT 2940 
TGAAACTATC ATCTAGTAGT GCTGTCCGTT TACTAGTGTA GCCAGATAGA ATAGAGCGGT 3000 
CAACGGAAAA GGTATCGATA CGATGAGCGT GCAGGGAAGT AATCAATTCT GGGTAGGAAC 3060 
CAAGTTCGAC GAATTTAAAC TTCAGACCTT TCTTTTTACC CAGTTCAGTA ATCAGGCGTT 3120 
GGGTGATAGA ACCTTGGGCG ACTCCGATGG TTTTGCCGTT TAGGTCCTCA ATCTTTTTGA 3180 
TTTTGGCAGA TTTATTGACC AAAAATCCAG AAGCGTCTGT GTAGTAGGGA CTGGTAAAGT 3240 
TGTAGAGTTT TTTGCGTTCG TCCGTGATGG TAAAGGTCGC GATA'i'CCATA TCGACCTGTT 3300 
CATTGTCTAG AAGGGGGCCG CGGGTTTGTG CTGTAACCGG CACATAGCGA ATCTTGACCT 3360 

TGAGTTCATC AGCTACCATC TTGGCCAAGT CGGTTTCGAT ACCAGAATAA GTACCGGTCT 3420 

TGGGATCTTT GTAACCAAAA TTGGGAACGT CTTGTTTGAC ACCGACAACC AGTTCGCCTC 3480 

TTTTTTGAAT GTCTGCGATA CTTGTATCAG CCTGGACTGG TTTGGCAGCA GCAAGGCCGA 3540 

AAAGGCTAAT CAATAATGCT GATAAAAAGA ATTTTTTTTC ATAGGCGCCT CCTTATTTGA 3600 

CTTTGTCACT TTCGTGGTTG ATAATTTTGC TGAGGAATTG TTGGGCACGA GGTTCGCTTG 3660 

GATTGTCAAA AAAGTTATCG ACATCTGTCG TATCTACTAA AACTTCTCCG TCGGCCATAA 3720 

AGATAATGCG GTCCGCAACC TCTCGAGCAA AGCCCATTTC GTGGGTAACG ATGATCATGT 3780 

TCATCCCATC ATGCGCCAGT TTCTGCATAA CTGCTAGAAC ATCTCCGATA GTCTCAGGAT 3840 

CAAGAGCAGA TGTTGGTTCA TCAAAGAGGA GGAGTTCCGG ATGCATAGCA AGACCACGAG 3900 

CGATGGCGAT CCGCTGTTTT TGTCCACCAG ATAGCATGGC GGGATAGGAA TCTTTCTTGT 3960 

CCCACATATT TACAAATTCC AGATATTTTT GGGCGGTTTT TTCAGCTTCT TTTTTATCAA 4020 

TTCCTAGAAC TTCAATGGGT GCAAGCGTTA CGTTTTCTAA CACAGCTTTG TGTGGATAAA 4080 

GGTTAAAATG TTGAAAAACC ATGCCGACTT CCTTGCGAAG AGGTACCAAA TCTTTCTGGC 4140 

TGGCACCAGC AACTTGGTGC CCATTGACTA GGAGACTTCC TTTGTCAACA GTCTCTAAAC 4200 

CATTGATCGT ACGGATAAGA GTGGACTTCC CAGAGCCAGA AGGTCCAAGC AGGACAACAA 4260 

CTTGTCCTTT TTCAAAACGG AGATTGATGT TGCGGAATGC GTGGTAGTCT CCGTAATATT 4320 
TTTCGACGTT TTTAAATTCT ACTAAAGCCA TGAGAGATCT CTATTGTGTT ATATTTTATA 4380 

ACACGGTTCT ACAATAAAAG AATGTTCTTG TCAAATCATA TCTGAAAAAA TTCACTATAG 4440 

TGAAATAAGA ACAGGAAAAA TCGATCGGGA CAGTCAAATC GATTTCTAAC AATATTTTAG 4500 

AAGTAGAGGT GTACTATTCT AGTTTCAATA TACTATAAAA TGTTATAAAA AAGCAATCTG 4560 

GATAGAGAAA ACGTCTAAAT CATGTTATAA TGAAGCAATA GAATTCTTAG AAAGAGTGGA 4620 

TGTCTTTTTG ATAACACCTA CTTATGAATG GCAGTTTGCC CTGCAGGTAG AAGATGCGGA 4680 

TTTTACAAAG ATAGCCAAGA AGGCTGGACT GGGTCCTGAG GTGGCTCGGT TATTGTTTGA 4740 

GAGAGGGATT CAGAACCAAG AAAGTCTGAA GAAGTTTTTA GAACCTTCCT TGGAGGACTT 4800 

ACATGATGCT TATCTGCTCC ATGATATGGA CAAGGCAGTG GAGCGGATTC GTCAGGCTAT 4860 

TGAAGAAGGG GAAAATATTC TTGTTTATGG AGACTATGAT GCGGATGGCA TGACTTCGGC 4920 

TTCTATTGTG AAGGAAAGTT TGGAACAACT TGGTGCTGAG TGCCGAGTTT ACCTGCCAAA 4980 

TCGTTTTACC GATGGCTATG GCCCTAATGC TAGTGTTTAT AAATACTTTA TCGAGCAAGA 5040 

AGGGATTTCC TTGATTGTGA CGGTGGACAA TGGGGTTGCT GGTCATGAGG CTATTGCATT 5100 

GGCTCAGTCT ATGGGAGTAG ATGTCATTGT GACAGACCAT CATTCCATGC CTGAAACCCT 5160 

GCCAGATGCT TATGCTATTG TCCATCCTGA ACATCCAGAT GCGGATTATC CTTTTAAATA 5220 

TTTGGCTGGT TGTGGAGTTG CTTTCAAGTT GGCTTGTGCC CTGTTAGAAG AAGTGCAAGT 5280 

GGAATTGCTT GATTTGGTCG CTATTGGAAC TATTGCAGAT ATGGTGAGTC TGACGGATGA 5340 

AAATCGTATC TTAGTTCAAT ATGGTCTGGA AATGTTGGGT CATACCCAGC GCATTGGTCT 5400 

GCAAGAAATG CTGGACATGG CTGGGATTGC TGCCAACGAA GTAACAGAAG AAACGGTTGG 5460 

TTTCCAGATT GCTCCTCGTT TGAATGCCTT GGGTCGCTTG GATGATCCCA ATCCTGCCAT 5520 

TGATTTGTTG ACTGGATTTG ATGATGAGGA AGCGCATGAG ATTGCCCTTA TGATTCACCA 5580 

GAAAAACGAA GAGCGCAAGG AAATCGTTCA GTCTATCTAT GAAGAAGCCA AGACCATCGT 5640 

GGATCCTGAG AAGAAGGTTC AGGTCTTGGC CAAGGAAGGC TGGAATCCTG GGGTTCTAGG 5700 

AATCGTGGCT GGTCGTTTAT TGGAAGAATT GGGACAGACA GTCATTGTTC TTAATATAGA 5760 

AGACGGTCGT GCCAAGGGCA GTGCPCGTAG TGTGGAAGCG GTCGATATTT TTGAAGCTCT 5820 

GGATCCCCAT CGAGACCTCT TCATCGCCTT TGGAGGTCAT GCAGGTGCAG CGGGTATGAC 5880 

GCTGGAAGTT GAGCAACTCT CAGATTTATC TCAGGTTTTG GAAGATTATG TTCGTGAAAA 5940 

AGGTGCAGAT GCTGGTGGCA AGAATAAGTT AAACCTAGAT GAAGAGTTGG ATTTGGAGGC 6000 

ACTTAGCTTG GAAACGGTCA AAAGTTTTGA ACGTTTAGCT CCTTTTGGAA TGGATAATCA 6060 
GAAACCTATT TTTTATATCA AGAATTTTCA GGTCGAAAGT GCTCGTACTA TGGGGGCAGG 6120 

TAATGCCCAT CTAAAGCTGA AAATTTCCAA GGGTGAGGCG AGTTTTGAAG TGGTAGCCTT 6180 

TGGTCAAGGC AGATGGGCGA CAGAGTTTTC TCAAACCAAG AATCTAGAGT TAGCGGTTAA 6240 

ATTGTCTGTC AACCAATGGA ATGGCCAAAC TGCCCTCCAG TTGATGATGG TGGATGCGCG 6300 

AGTGGAAGGT GTTCAACTTT TTAACATTCG TGGAAAAAAT GCAGTCTTGC CAGAAGGTGT 6360 

TCCAGTCTTG GATTTTCCTG GAGAACTGCC AAATCTTGCG GCTAGTGAAG CTGTTGTCGT 6420 

AAAAAACATT CCAGAGGATA TTACTCAGCT GAAGACCATT TTTCAGGAAC AGCATTTCTC 6480 

TGCTGTCTAT TTCAAAAATG ATATTGACAA GGCTTATTAT CTGACAGGTT ATGGGACTAG 6540 

AGATCAGTTT GCCAAATTGT ACAAGACTAT TTACCAGTTC CCAGAGTTTG ATATTCGCTA 6600 

CAAGCTGAAA GATTTGGCTG CATATCTTAA TATTCAACAA ATCTTGCTGG TCAAGATGAT 6660 

TCAAGTATTT GAAGAACTAG GCTTTGTGAC GATAAAAGAT GGTGTGATGA CAGTCAATAA 6720 

AGAGGCGCCA AAGCGGGAGA TAGGAGAAAG TCAAATTTAC CAAAATCTCA AACAAACCGT 6780 

TAAAGACCAA GAAATGATGG CGCTGGGTAC GGTGCAAGAA ATTTATGATT TTTTGATGGA 6840 

AAAAGAGTAG AAGTTAGGAA AGAGTTGGGA AATCAACTCT TTTTTGAAAA CAGACCTTCA 6900 

TTTTGAAAAT CATCAAAAAA ATGGTATAAT GGTAGGAAAA GATTCGGCTG AAAGTATCAG 6960 

AACTTTTAGA ATAAGAGGGT AGAATTGCCC TATAATCAAG ATAAACTAAG ATTTTGGAGG 7020 

AAAAATGAGT AATATCAGTT TAACAACACT TGGTGGTGTG CGTGAGAATG GAAAAAATAT 7080 

GTACATTGCT GAAATTGGAG AGTCCATTTT TGTTTTGAAT GTAGGGTTAA AATATCCTGA 7140 

AAATGAACAA TTAGGGGTCG ATGTGGTGAT TCCAAACATG GATTACCTTT TTGAAAATAG 7200 

CGACCGTATT GCTGGGGTTT TCTTGACCCA CGGGCATGCG GATGCCATTG GTGCTCTACC 7260 

GTATCTCTTG GCAGAGGCTA AAGTTCCTGT ATTTGGGTCT GAGTTGACCA TTGAGTTGGC 7320 

AAAGCTCTTT GTCAAAGGAA ATGATGCCGT TAAGAAATTT AATGATTTCC ATGTCATTGA 7380 

TGAGAATACG GAGATTGATT TTGGTGGGAC AGTGGTTTCC TTCTTCCCTA CGACTTACTC 7440 

CGTTCCAGAG AGTCTGGGAA TTGTCTTGAA GACATCGGAA GGAAGCATCG TTTATACAGG 7500 

TGACTTCAAA TTTGACCAAA CGGCTAGTGA ATCTTATGCA ACTGATTTTG CTCGTTTGGC 7560 

AGAGATTGGT CGTGACGGCG TCCTGGCTCT CCTCAGTGAT TCGGCCAATG CAGACAGCAA 7620 

TATTCAGGTG GCTAGTGAAA GTGAAGTTAG GGATGAAATT ACCCAAACTA TTGCTGACTG 7680 

GGAAGGTCGT ATCATCGTTG CAGCTGTTTC CAGTAATCTT TCTCGTATTC AGCAGATTTT 7740 

TGACGCTGCG GATAAAACAG GTCGACGTAT CGTCTTGACA GGATTTGATA TTGAAAATAT 7800 

CGTCCGCACA GCGATTCGTC TTAAGAAGTT GTCTTTAGCC AACGAAATTC TTTTGATTAA 7860 
GCCTAAAGAT ATGTCTCGCT TTGAAGACCA TGAGTTGATT ATTCTTGAGA CAGGTCGTAT 7920 

GGGTGAGCCT ATCAATGGAC TTCGTAAGAT GTCGATTGGT CGCCATCGTT ATGTAGAAAT 7980 

CAAGGATGGG GACCTAGTCT ATATTGCTAC GGCTCCGTCT ATTGCTAAAG AAGCCTTTGT 8040 

TGCGCGTGTG GAAAATATGA TTTATCAGGC AGGTGGGGTT GTCAAATTGA TTACCCAAAG 8100 

TTTACATGTA TCAGGGCACG GAAATGTGCG TGATTTGCAG CTGATGATCA ATCTTTTGCA 8160 

ACCTAAGTAC CTCTTCCCTG TCCAAGGGGA GTATCGTGAG TTGGATGCTC ACGCTAAGGC 8220 

TGCCATGGCA GTTGGGATGT TGCCAGAACG CATCTTCATT CCTAAAAAGG GGACGACCAT 8280 

GGCTTACGAG AATGGAGACT TTGTTCCAGC TGGATCGGTT TCAGCAGGAG ATATCTTGAT 8340 

TGATGGGAAT GCCATTGGTG ATGTTGGAAA TGTTGTTCTT CGTGACCGTA AGGTCTTGTC 8400 

AGAGGATGGA ATTTTCATCG TGGCTATTAC AGTCAACCGT CGTGAGAAGA AAATTGTGGC 8460 

TAGGGCTCGT GTTCACACGC GTGGATTTGT TTATCTCAAG AAGAGTCGCG ATATTCTCCG 8520 

TGAAAGTTCA GAATTGATTA ACCAAACGGT AGAAGAGTAT CTTCAAGGAG ATGACTTTGA 8580 

CTGGGCAGAT CTCAAAGGTA AGGTTCGTGA CAATCTGACC AAGTACCTCT TTGATCAAAC 8640 

CAAGCGTCGC CCAGCCATTT TACCAGTAGT CATGGAAGCA AAATAATCGT TGAAATAAAC 8700 

AGAGAGAAAG TCGAGTTTCG GCTTTTTCTT ATAGAAAAAT AGAAGGAGAA AATCATGGCA 8760 

GTGATGAAAA TCGAGTATTA CTCACAAGTA TTGGATATGG AGTGGGGGGT GAATGTCCTC 8820 

TACCCTGATG CCAATCGAGT GGAAGAACCA GAGTGTGAAG ATATTCCCGT CTTGTACCTT 8880 

TTGCACGGGA TGTCTGGAAA TCATAATAGT TGGCTTAAGC GGACCAATGT AGAACGCTTG 8940 

CTTCGAGGAA CTAATCTCAT CGTTGTTATG CCCAATACCA GCAATGGTTG GTACACCGAT 9000 

ACCCAGTATG GTTTTGACTA CTACACGGCT CTAGCAGAGG AATTGCCACA GGTTCTGAAA 9060 

CGCTTCTTCC CTAATATGAC GAGCAAGCGT GAAAAGACCT TTATCGCTGG TCTTTCTATG 9120 

GGAGGCTACG GCTGCTTCAA ACTGGCTCTT ACGACAAATC GTTTTTCTCA TGCAGCTAGT 9180 

TTTTCAGGTG CCCTCAGCTT TCAAAACTTT TCTCCTGAAA GTCAAAATCT GGGAAGTCCA 9240 

GCCTACTGGA GAGGTGTTTT TGGAGAGATT AGAGACTGGA CAACTAGTCC CTATTCTCTT 9300 

GAAAGTCTGG CTAAAAAATC GGATAAAAAG ACCAAACTTT GGGCGTGGTG TGGCGAACAG 9360 

GATTTCTTGT ACGAAGCCAA TAATCTCGCA GTGAAAAATC TCAAAAAACT AGGTTTTGAT 9420 

GTGACCTATA GCCATAGCGC TGGAACTCAC GAGTGGTACT ACTGGGAAAA ACAATTGGAA 9480 

GTTTTTTTAA CAACCCTACC AATTGATTTC AAATTAGAAG AGAGACTGAC TTAGTTTGAA 9540 

CTTCAGCATA GGGGGAGTAG AACTAAAATA AAATATGTTT TCACTAGACT TTTCAAACGm 9600 
AAGTAGTAGA ATAGTAATAA AATACTGGAG GAAAGAGAGT AGGAAATGTA CCGTTATCAA 9660 

ATTGGCATTC CCACATTAGA ATATGATCAG TTTGTCAAAG AACATGAATT AGCCAATGTA 9720 

TTACAAAGTA GTGCTTGGGA GGAAGTTAAG TCTAATTGGC AACATGAGAA GTTTGGTGTT 9780 

TACAGGGAAG AAAAATTACT GGCGACAGCT AGTATTTTGA TTAGAACTCT TCCGCTAGGC 9840 

TATAAAATGT TTTACATCCC AAGAGGACCT ATATTGGATT ATGGGGATAA AGAACTCTTG 9900 

AATTTTGCCA TTCAGTCTAT TAAGTCCTAT GCTCGCAGTA AGAGAGCGGT TTTTGTGACT 9960 

TTTGACCCAA GTATTTGCCT ATCTCAAAGT TTAATCAATC AGGAAAAGAC AGAATTTCCT 10020 

GAAAATCTGG CTATTATTGA TAGTTTGCAA CAAATGGGAG TAAGGTGGTC AGGAAAAACG 10080 

GAGGAAATGG GAGACACCAT TCAACCTCGT ATTCAGGCGA AAATATACAA GGAAAATTTT 10140 

GAAGAAGATA AACTTTCCAA GTCAACAAAA CAGGCTATTC GAACAGCACG AAACAAAGGG 10200 

CTTGAGATTC AATATGGTGG ACTGGAACTA TTAGATTCAT TTTCGGAGTT GATGAAAAAA 10260 

ACTGAGAAGC GAAAAGAGAT TCATTTGAGG AATGAAGCCT ATTATAAAAA ATTGTTAGAT 10320 

AATTTTAAGG ACAAGGCCTA TATCACCTTG GCCACCTTGG ATGTTTCTAA ACGTTCGCAA 10380 

GAGTTAGAAG AACAGTTAGC GAAAAATAGA GCCTTGGAAG AGACCTTTAC TGAGTCGACT 10440 

CGAACTTCAA AAGTAGAAGC GCAGAAGAAG GAAAAAGAAC GTTTGTTAGA GGAATTGACC 10500 

TTCTTGCAGG AATATATAGA TGTAGGTCAA GCGAGAGTTC CTTTAGCGGC TACTTTGAGT 10560 

TTGGAATTTG GTACTACCTC TGTCAATATA TATGCTGGTA TGGATGATGA TTTTAAACGT 10620 

TACAATGCAC CAATTTTAAC TTGGTATGAA ACGGCTCGCT ATGCCTTTGA ACGAGGTATG 10680 

ATCTGGCAAA ATTTAGGTGG TGTTGAAAAC TCTCTCAATG GTGGACTTTA TCATTTTAAG 10740 

GAAAAATTTA ATCCAACGAT TGAAGAATAC TTGGGTGAAT TTACAATGCC CACTCATCCT 10800 

CTCTATCCTC TGTTAAGACT TGCTCTTGAT TTCCGTAAAA CATTAAGAAA AAAACATAGA 10860 

AAGTAAGTAT ATGGCACTAA CAACACTCAC GAAAGAAGAG TTTCAGACTT ATTCTGATCA 10920 

GGTTTCTTCT CGTTCCTTTA TGCAATCTGT CCAGATGGGG GATTTCCTAG AAAAAAGAGG 10980 

GGCTCGAATT GTTTATCTTG CTTTGAAACA AGAAGGAGAA ATTCAAGTTG CAGCTCTGGT 11040 

TTATAGCCrG CCCATGCTGG GTGGTCTGCA TATGGAACTC AATTCGGGGC CGATTTATAC 11100 

CCAACAAGAT GCTCTTCCAG TTTTTTATGC AGAGTTAAAA GAATATGCCA AGCAAAATGG 11160 

TGTATTAGAG TTGCTTGTAA AACCCTATGA AACTTATCAA ACTTTTGATA GCCAAGGTAA 11220 

TCCAATAGAT GCTGAGAAAA AAAGTATTAT TCAAGATTTG ACTGATTTAG GTTATCAATT 11280 

TGATGGCTTA ACAACAGGTT ACCCAGGTGG AGAACCAGAT TGGTTATACT ATAAAGATTT 11340 

AACTGAATTA ACTGAAAAGA GTTTGCTTAA AAGTTTTAGC AAAAAGGGTA AACCCTTGGT 11400 

GAAAAAGGCT GAAACCTTTG GCATTCGGTT GAAAAAGTTA AAACGTGAAG AACTATCGAT 11460 

TTTTAAGAAT ATAACAAAAG AAACCTCTGA ACGTAGAGAA TATAGTGATA AAAGTTTAGA 11520 

ATATTATGAG CATTTTTATG ATACTTTTGG AGAACAAGCG GAGTTTCTCA TAGCAAGCTT 11580 

AAATTTTTCG GACTATATGA GCAAATTGCA AGGTGAACAA AGTAAACTAG AAGAAAACTT 11640 

GGACAAGTTG CGACTTGATT TGAGTAAAAA TCCTCATTCT GAGAAAAAAC AAAATCAACT 11700 

GAGAGAATAT TCTAGTCAAT TTGAAACGTT TGAAGTTCGA AAAGCAGAAG CGCGAGACTT 11760 

GATTGAAAAA TATGGAGAAG AAGATATTGT TTTAGCTGGG AGTTTATTTG TTTATATGCC 11820 

TCAGGAAACG ACTTATCTCT TTAGTGGTTC CTACACTGAG TTTAATAAGT TCTATGCCCC 11880 

TGCACTGCTT CAAAAATATG TTATGTTGGA AAGCATAAAA CGTGGAATAC CTAAATACAA 11940 

CTTCCTAGGC ATTCAAGGGA TTTTTGATGG AAGTGATGGT GTTTTGCGTT TTAAACAGAA 12000 

TTTTAATGGC TATATTGTAC GCAAAGCAGG TACTTTCCGT TACCATCCAT CGCCTTTAAA 12060 

ATACAAAGCT ATCCAGTTAC TCAAAAAAAT AGTAGGACGT TAAGATGAAA AAGTCAGTAT 12120 

TTAGATTTCT TTTAGCTTCT TTTAGTAAAA TAATTCTTAT TTGCTAGAAA GGTGGAGAGA 12180 

CATGCGCTGG CTTTTTCGTT TGATAGGGGC TTTCTTTTCT TTTGTGTGGC GTTTGTTTTG 12240 

GCGTCTGGTT TGGATAGTTG TGCTCTTATG TGTGCTTGCT TTCGGACTTC TCTGGTATCT 12300 

GAACGGAGAT TTTCAAGGAG CGCTAAAGCA AGCAGAACGG TCAGTAAAAA TTGGTCAACA 12360 

AAGTATTGAC CAATGGGAGA AAACAGGGCA ACTGCCTAAG TTAAGCCAGA CAGATAGTCA 12420 

CCAGCATTCT GAAGGAAGGT GGGCACAGGC CTCTGCTCGT ATTTACCTGG ATCCGCAGAT 12480 

GGATTCACGC TTTCAAGAGG CTTATTTAGA AGCAATCCAG AACTGGAATC AAACTGGTGC 12540 

TTTTAACTTT GAACTCGTGA CTGAGTCTAG TAAGGCGGAT ATTACGGCTA CGGAGATGAA 12600 

CGACGGAGGC ACTCCTGTGG CAGGAGAGGC GGAAAGTCAA ACTAATCTCT TAACAGGGCA 12660 

ATTCTTGTCC GTAACGGTGC GGTTGAATCA TTATTATTTG TCCAATCCAT ACTATGGCTA 12720 

CTCCTATGAA CGCCTTGTCC ATACGGCAGA ACATGAGTTA GGTCATGCGA TTGGCTTGGA 12780 

CCATACAGAT GAGAAGTCTG TCATGCAACC AGCAGGTTCC TTTTATGGTA TCCAGGAAGA 12840 

GGATGTTGCA AACCTCCGAA AAATATATGA GACTAGTGAG TAGGGTACTA TCTTTCCCTA 12900 

CTTTTTTTGC TATAATGGAA CTATGAACAA CTTGATTAAA TCAAAACTAG AGCTCTTGCC 12960 

GACCAGCCCT GGTTGCTACA TTCATAAGGA TAAAAATGGC ACCATTATCT ATGTAGGAAA 13020 

GGCTAAAAAT CTGCGTAATC GAGTACGGTC CTATTTTCGT GGAAGTCATG ATACCAAGAC 13080 

AGAGGCTCTG GTGTCTGAAA TTGTGGATTT TGAATTTATT GTTACGGAGT CTAATATTGA 13140 

GGCACTTCTC CTAGAAATCA ACCTGATCAA GGAAAACAAG CCCAAGTACA ATATCATGCT 13200 

CAAGGATGAC AAGTCCTATC CTTTCATCAA AATCACCAAT GAGCGCTATC CACGCTTGAT 13260 

TATCACTCGT CAGGTCAAAA AGGACGGAGG TCTTTATTTT GGACCCTATC CCGATGTGGG 13320 

GGCAGCCAAT GAAATCAAGC GGTTGCTGGA TCGGATATTC CCTTTTCGTA AGTGTACCAA 13380 

CCCGCCCTCT AAGGTCTGTT TTTATTACCA TATCGGCCAG TGTATGGCCC ACACCATCTG 13440 

TAAGAAGGAT GAGGCTTATT TCAAGTCTAT GGCCCAGGAG GTGTCTGATT TTCTGAAAGG 13500 

TCAGGATGAC AAAATCATCG ATGATCTCAA GAGTAAAATG GCAGTAGCAG CACAAAGTAT 13560 

GGAGTTTGAA CGTGCGGCGG AATACCGTGA CCTGATTCAG GCTATTGGAA CGCTTCGAAC 13620 

CAAGCAACGG GTCATGGCGA AAGATTTGCA AAATCGCGAT GTCTTTGGCT ACTATGTGGA 13680 

TAAGGGCTGG ATGTGTGTGC AGGTTTTCTT TGTCCGTCAG GtAAGCTCAT CGAGCGCGAT 13740 

GTCAATCTCT TCCCCTACTT CAATGATCCA GATGAGGATT TTTTGACCTA TGTAGGACAA 13800 

TTCTATCAAG AAAAATCTCA TCTAGTTCCC AATGAGGTAC TGATTCCGCA GATATTGACG 13860 

AAGAAGCTGT CAAGGCTTTG GTGGATTCCA AGATTCTTAA GCCTCAACGT GGAGAGAAAA 13920 

AACAACTGGT CAATCTAGCC ATAAAAAATG CTCGTGTTAG TCTAGAGCAG AAGTTCAATC 13980 

TGCTAGAAAA ATCTGTCGAA AAGACTCAAG GAGCTATTGA AAATCTAGGG CGTTTGCTCC 14040 

AAATCCCGAC CCCAGTACGT ATCGAGTCCT TCGATAACTC TAATATCATG GGAACTAGCC 14100 

CTGTTTCGGC TATGGTGGTC TTTGTCAACG GTAAACCGAG TAAGAAGGAT TACCGTAAGT 14160 

ACAAGATAAA AACGGTTGTT GGACCAGACG ACTATGCCAG CATGAGAGAG GTCATTCGCA 14220 

GACGCTATGG TCGAGTACAG CGTGAGGCTT TGACTCCTCC AGATTTGATT GTGATTGATG 14280 

GGGGGCAAGG TCAAGTCAAT ATCGCTAAGC AGGTTATCCA AGAGGAACTG GGCTTGGATA 14340 

TTCCAATTGC TGGGCTGCAA AAGAATGATA AGCACCAAAC CCATGAATTG CTCTTTGGAG 14400 

ATCCGCTTGA GGTGGTGGAT TTGTCTCGCA ATTCTCAGGA ATTTTTCCTC CTCCAACGCA 14460 

TCCAAGATGA GGTGCACCGC TTTGCTATCA CTTTCCACCG CCAACTGCGC TCCAAAAATT 14520 

CTTTCTCATC TCAATTGGAT GGGATTGACG GTCTGGGACC TAAACGCAAG CAGAATCTTA 14580 

TGAAGCATTT CAAGTCTTTG ACCAAAATCA AGGAAGCCAG TGTGGATGAG ATTGTCGAAG 14640 

TTGGGGTACC TAGAGTCGTT GCAGAGGCTG TGCAAAGAAA GTTGAACCCG CAGGGAGAAG 14700 

CCTTGCCTCA AGTAGCAGAA GAAAGAGTAG ATTACCAAAC GGAAGGAAAC CACAATGAAC 14760 

CATAAAATCG CAATTTTATC AGATGTTCAT GGCAATGCGA CGGCGCTAGA AGCAGTGATT 14820 

GCAGATGCTA AAAATCAAGG GGCCAGTGAA TATTGGCTTC TGGGAGATAT TTTTCTTCCT 14880 

GGTCCAGGCG CAAATGACTT AGTCGCCCTG CTAAAGGACC TTCCTATCAC AGCAAGTGTT 14940 

CGAGGCAATT GGGATGATCG TGTCCTTGAG GCTTTAGATG GGCAATATGG CTTAGAAGAC 15000 

CCACAGGAAG TTCAGCTCTT GCGTATGACA CAGTATTTGA TGGAGCGAAT GGATCCTGCA 15060 

ACGATTGTCT GGCTACGAAG CTTGCCTTTG CTGGAAAAGA AAGAAATTGA CGGATTGCGC 15120 

TTTTCTATCT CTCATAATTT ACCTGACAAA AACTATGGTG GTGACTTGCT AGTTGAGAAT 15180 

GATACAGAGA AATTTGACCA ACTGCTAGAT GCGGAAACGG ACGTGGCAGT TTATGGTCAT 15240 

GTTCACAAGC AGTTGCTTCG TTATGGAAGT CAAGGGCAAC AAATCATCAA TCCAGGGTCG 15300 

ATTGGCATGC CCTATTTTAA TTGGGAGGCG TTAAAAAATC ACCGTTCCCA GTATGCCGTG 15360 

ATAGAAGTTG AAGATGGGGA ATTACTCAAT ATCCAATTTC GTAAAGTTGC TTATGATTAC 15420 

GAAGCTGAGT TAGAATTGGC CAAGTCCAAG GGGCTTCCCT TTATCGAAAT GTATGAAGAA 15480 

CTGCGTCGTG ACGATAACTA TCAGGGGCAC AATCTGGAAT TATTAGCCAG CTTAATAGAA 15540 

AAGCATGGGT ATGTAGAGGA TGTGAAGAAT TTTTTTGATT TTTTGTAAGA GTTTCCTAAA 15600 

ATAGCCAATG CAAACTAAAA AAGCGATTTG CTGGTCCAAT CGCTTTTAGT ATATCTTATA 15660 

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGTA GGTTGCTCAA AGCACAGCTT 15720 

TGAGGTTGCA GATAAAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTGTAACT 15780 

GAGATTGATC TGGGAGGTAA GAACCACCTA GATAGGTATT GCTGAGTTTT TCAAGGGTTC 15840 

CGTCTTGATA GAGTTCTTTG AGCGCTTTAT CAAATTGCTC TTTAAACTCT TTTTGGTCGC 15900 

TTGAGAAAAT GATATAATTG CTGGGGCTAT CTGCAGAAGG TAAATCAACG ACTGAGAGGT 15960 

CTAAACCACG GTCCTTGATA ATCTTTTGAA CGGATACCTT GTCAAAAACT AGGAAATCAA 16020 

ACTCTCCGTT AGCAAGGTCT AGGATTCGTT TACCAATATC CTCACCAGAA AAATTAATTG 16080 

TAGCGGGATT ATCAGTGTGT TTCTGATTCC AGTTATTGAT GAATTGAGCG TTAGAAGTTC 16140 

CGGTATCCTC TTGTGTTGTT TTACCAGCGA TCTGGTCAAG AGAAGTCAAA GGATTTTTCT 16200 

TGTTGCTGAC AAGGACGAGG GGATTGTTGG AAATTGGAAG CGAGTAAAGG TATTTTTCAG 16260 

CACGCTCTTT TGTGTAACTC AAGTTATTGG CCGCAGCCTG ATAGTGACCA GAATCAAGTC 16320 

CTGGGAAGAT GCTCTCCCAG GCGGTTCTTT GGAATTGAAT CTCGTAGTCG CTGAGTTTTT 16380 

CATCTACTGC CTTTAAAACT TCGATATCAA AGCCTGTCAG ATTGCCCTTG TCTTCGTAGT 16440 

CAAATGGTGG CACGTCGCCA GCTGTAGCAA GGACGATTGT CTTTTGAGCG CTAGTCTCTT 16500 

TGGGTGTAGC TTGATTCTCA CAGGCAACCA AAAATGGTAG GATAGCTAGT AATAGGCTAA 16560 

ATTTTTTCAT ACTGTCTCCA TTCAAATGTA AAG 16593 


(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
GGGATATCCT TATATCCTTG TTCCTGGAAC CATTGTGGGA ATTGCTCAAC AGTTTTTTCA 
CCTTGAATTC CTGGTGCAAT GACAGTAAGA ATTTCGAAAT CACGATCTGG TTTCGCCGCT 
AGTTCCATCA ACTCTGGCAT ACTTTTCTTG CATGGACCAC ACCATGAAGC CCAAAACTTC 
AAGTAAACCT TTTTACCCTT AAAATCAGAT AACTTAACTT CTTTGCCATC CATGGATTGC 
AATGTGAAGT CTGGAGCATC TTTTCCAACA GCAATTTGTT GTACAGTCGT TTGTTGTTTT 
GGCTGTTGTG CTGCTTGAGT CTTTTTAGTT TCTTCCTCAC CACAGGCCAT CAATACAACT 
AATGACAAGA GACTTAAGCC AGCAAACATT ACTTTTTTCA TTTGTCCTCC TTTATTCAAA 
AATTCCAGCT AGAACATTTA CTTGTCCTAA TAGTAACAAA ATTCCCATTA AAACAATGAG 
GAAACCACCA ATTTTCTTTA GTAGCATCAT ATGACGCTTG ATTTTACTAA AATATGGCAT 
GACTAGACCT GAAGCTAGTG CCAATACCAA GAAAGGAAGG GCCATGCCaG AGTGTAAATG 
AGAGTATAAA TCGCTCCTTG CCAAGCGCCA TTGCCTCCAG AAGCCGCAAG TGCTAAAACA 
GAACTTAAAA CTGGACCAAT ACAAGGTGTC CAACCAAAGC TAAAGGTAAT ACCAAGTAAA 
AAAGCTGACC AATAACGATT AGAATCTGAT TTTTTAAAGG TAAAACTTTT TTGAACTTCT 
AATTTCTTCA AATGAAAAAT TTCCATCTGG TGAAGACCCA AAATGATAAT AATAGCTCCC 
ATGCCATATC GAAACCAATT TGCATAGAGA ATATGACCAA AGTAACCAGC ACCAAAGCCT 
AGAATAAAGA AAATGAGAGA GATACCAGCG ATAAAGCAAA GTGTTCGAAT CAAGCCTGAC 
CAGAGAACCT TTCTCCCAAA CAAAGAAAAG CTTTTTGCAC TTTCTTGATC ATCCAATAAA 
ATCCCAGCAT AGACTGGCAG AAGAGGAAAA ATACAAGGAG AAAAAAAGGA TAAAACACCT 
GCTAGAAAAA CAGAGATTAA AAATACTATC GTTTCCAATA AAGAACCAAC TTTCTTAATA 
ATTCTAATCC TATTTTACTA TATTCAATTT TATTTGTAAG CTTTCTGCTA CGCAAAATCG 
TATCGGGCAC TATTGGACCA ATCTTTTCTT TTGCTAGTCA AGGCGGATCT TATCCCCCAA 
AATAGCCAAA AAGCAACGAC AAGGATTACT CATCGCTGCT TTTGTGAACG AAAATGTCTT 
TTAGGTCTGA CATTTCATAA ATCATGTTTT ACTTGAGTTT GTCAAGGATT GCTTTAAGCT 
CCTCTACTAG TTTAGTTTCT GTCTCTGCTG AGCCATTTTC TTCTTTCACG AAATCAAGGG 
TTTCTTGGAG AAGGTTTTGG GCTTTGGCAA GGACTTTTTT ATCCGCTTTT TCTGCATCTA 
GCTGTCCTAG AACCTTGATC AATTCCGTGC TTAATTGCTG GATTTCTGAC TCTTTCTTAC 
GGCGAATCAG CCAGAAGGCA ATCACGCCTA GGAGGGCAAG TAGACTGACC ACAATCACTC 
CTGCOGGAAC TGAGTTTGTT TCACTCATCT TATCTGAATC CTTACTATCT TCCGTTCCTT 
GTTTTGCATC CTTCTTGTCC TGTGCAGGCT WCTOTeOCT AGCATTTGCT TTCACATCTT 
TGAGAGAGTC CAAGGCAGCC CAGCCTTCAC AGACTCTACT GCAGTATGCA GACCWACTC 
TGTCAAGGCA CTATCTTCCG GAGCTTTTTG AGCATCTAGG AGGACAGCCT TGGTTGCATC 
GATTTTCGGA TCAGATACTG TTGCCAAAGC TTTCAAGCGT TGGTCTAACT CTTGACTCAA 
GGCACGAAGT TCAGACTTGT CAACTTGCTC TTGAGCTTGT GTGCTCGTTG AGCTAGCCGA 
AGCGCTTGCT ACCACTCTAG GATCTTGAGT CGGAGCTGAG CTTGGAGCTG GGACAGGGCT 
TGCAGGTTGA CTAGGAACAG TTATGGTATA TTGAAACTAG AATAGTACAT ATGGACTTCT 
AAAACATTGT TAGAATTCGA TTTTACTGTC CTGATCGATT TGTCCTATTC TTATTTCATT 
TTACTATAAT AACCGATGGT GTGGTTAATG TTGGTAAGAG AAACTTCTGA AACCAAGCTT 
CAAAAAAGTC GCTCGTCATC GTCTCTTOGT AAGTCATTGG AGCGATTAAT TCACCATTTG 
TTAGACCTGC AACCAAAGAA ATCCTCTGAT ATCTTCTTCC AGATACTTTG CCTCTTATTA 
ACTGACCTTT TAATGAGCGA CCATATTCTC GATAAAAATA AGTATCGAAT CCTGTTTCGT 
CAATCTAAAC AGGTGCTAGG TGCTTTAAAC TATTAAAATT CTTAAGAAAT AAGGCTACTT 
TTTCTGGGTC TTGTTCATAG TAGGTGTGGT TCTTTTTTTC GAGTGTAGCC CATAGCTTTG 
AGCGCATAGT GGATGGTAGT TGGATGACAG CCAAAJcTCAG AAGCTATTTC AGTCAAATAA 
GCrTCTGGAT TGTCAGTAAG ATAGTTTTTA AGTCTATCTC TATCAACTTT TCTTGGTTTT 
GTTCCTTTTA CTTGGTGGTT TAGCTCTCCT GTTTTCTCTT TTAGCTTTAA CCAGCCATAA 
ATGGTATTAC GTGAGATTTG GAAAACGTGT GATGCTTCTG TTATACTACC TATTCGCTCA 
CAATAAGAGA GAACTTTTTT ACGAAAATCT ATTGAATATG CCATAAGAAG ATTATACCAC 
ATTGTGTACT ATTTTTGGTT CATTTCACTA TAACACAAAA TAGATTATTA TTACATAACA 
AAAAAGAGGT CTAAACCTCT TAACTCAATT ACTCCGCCAG TAGGACTCGA ACCTACGACA 
TCATGATTAA CAGTCATGCG CTACTACCAA CTGAGCTATG GCGGATTAAA GCTAAGCGAC 
TTCCCTATCT CACAGGGGGC AACCCCCAAC TACTTCCGGC GTTCTAGGGC TTAACTTCTG 
TGTTCGGCAT GGGTACAGGT GTATCTCCTA GGCTATCGTC ACTTAACTCT GAGTAATACC 
TACTCAAAAT TGAATATCTA TTCAATTTAA GAAAACCGTT CGCTTTCATA TTCTCAGTTA 
CTTTGGATAA GTCCTCGAGC TATTAGTATT AGTCCGCTAC ATGTGTCGCC ACACTTCCAC 
TTCTAACCTA TCTACCTGAT CATCTCTCAG GGCTCTTACT GATATATAAT CATGGGAAAT 
CTCATCTTGA GGTGGkTtCA CACTTAGATG CTTTCAGCGT TTATCCCTTC CCTACATAGC 
TACCCAGCGA TGCCTTTGGC AAGACAACTG GTACACCAGC GGTAAGTCCA CTCTGGTCCT 
CTCGTACTAG GAGCAGATCC TCTCAAATTT CCTACGCCCG CGACGGATAG GGACCGAACT 
GTCTCACGAC GTTCTGAACC CAGCTCGCGT 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20986 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
CGGAGAAAAA CATGGCTAAG TCAAACTTTG AAAAAGTAGA ATCAGTTGTT GGCTGGGTTC 
GTGATAAGAA AATCACAGGC TACCGTATCT CTAAAGAAAC GAATGCGCGT GAAATGTCTA 
TCATTGCTCT GGCGCAGGGT CGTGCAAAAG TAAAAAATAT TTCATTTGAA ACAGCCCTAG 
GCCTAATTGA TTTCTATGAA AAAAATTATG AAAAATTTGA AGATTAATCT TTGGATAACG 
GCGGATTCTT GACCTTCAAG TAGTAGAGAT AGAGAATCTG CCTTTTCATT TTGAGGACAG 
CAAAAAGACT GCACGGTTGA TGCAGCCTTT TCTTTTTATT TGAGATAGCG TTGAAGGAAC 
TCTTTTGTTC GGTCTTCTTT AGGATTGGTG AAGAGGTCTT CTGGTTTACC TTCTTCAGCG 
ATCACGCCCT TATCCATAAA GATAACACGG TGAGAGACAT CACGGGCAAA TTCCATTTCA 
TGGGTTACGA CAATCATGGT CAAGCCTTCC TGAGCCAGGT CCTGCATGAT TTTGAGGACT 
TCTCCAACCA TTTCTGGATC GAGAGCTGAT GTTGGTTCAT CAAAGAGAAT AGCGTCCGGA 
TTCATGGAGA GGGCACGAGC GATGGCCACA CGTTGTTTTT GACCACCTGA GAGTTGTTTT 
GGTTTGGCTT GCCAGTAGCG TTCTCCCATG CCGACCTTTT CCAGGTTTTC TTTGGCAATC 
TTTTCAGCTT CTGTGCGTTC GCGTTTTAGG ACAGTTGTCT GAGCGACGAT TGTGTTTTCA 
AGAACATTGA GATTTTCAAA GAGGTTAAAG GATTGGAAAA CCATCCCCAA CTTTTCACGG 
TATTGCGTGA GGTCATAGCC TTTTTCGAGG ACGTTTTGTC CATGATAAAG GATTTGTCCA 
TCAGTTGGTG TTTCAAGTAG GTTAATGGAG CGTAGGAAGG TCGATTTTCC GCTTCCAGAG 
CTTCCGATGA TAGAGATGAC CTCTCCCTTG TGGACAGTGA GTGAAATGTC TTTTAGCACT 
TCGTTTTGTC CATAGGATTT TTTGAGGTGT TTAATTTCAA GGATTGCTTG TGTCATTATT 
TCAAATCCTC CGTTTGCATT TGGTTAGCAC CTGTAGTGTA GGTATCCATG TCCATTCTGC 
GCTCGATAAA GCGTAGGATA CGTGTTACGG TGAAGGTGAG GACAAAGTAA ATCACGGCGA 
TGATTGTAAA TGTCTGGAAG TATTGATAGG TTTGTGTTGC CACGGTATTT CCTGAGAAAT 
AAAGTTCGAC AACAGAGATA ACGTTCAATA CAGATGTATC TTTGATATTG ATGACAAATT 
CATTACCAGT TGCAGGTAGG ATGTTACGGA CTACCTGAGG TAGGACAATC TTACGCATGG 
TCTGGTTATG GGTCATACCA AGAGCAGTCG CAGCTTCAAA TTGTCCCTTG TCAACTGCTA 
GGATACCACC ACGGACGATT TCAGTCATGT AGGCACCGGT ATTGATTGAA ACGATGAAGA 
TAGCAGCCAG TGTACGGTCA AGGTTGATCC CGAAAGCTTG GGCAGTTCCA TAGTAGATAA 
CCATCGATTG AACAATCATT GGCGTACCAC GGAAAATTTC AATGTAGACA TTGAGAACCC 
AGCCGACTAG TTTTTGTAGG CCGTAAATGA CTTTGTTTTC AGAGAGAGGA GCAGTACGGA 
AGACACCAAT GGCAAGTCCA ATAATGAGAC CTATGATGGT TCCGACGATA GAGATTAAAA 
GAGTGATACC AGCACCACGC AAGAGTTGTT GCCAGTTTTC AGAAAGAATT TTAGCAACTT 
GGCTAAAGAA ACTACTGCTA GTCTCTTCAG TTGTTGTAGC TTCGGCAGGT TGTTCCTTGA 
TCATACGATC CATCAAGGCA ACTTGGTCAT CTTTTGAAAT GGTTTCAATG CTGGCATTGA 
TTTGGCTAAT ACGATTGTCA TTTTTACGAA GCCCGATAGC GATAGCTGTA TCTTCTTCCC 
CAGTTTTGAA ACCAGGTTCT ACTTGAATCA TCTTGAACTT AGAGTTCGCA GCTTCAGCAG 
TCAGTGCTTC TGGACGTTCA GAAACATAAG CATCAATGAC ACCAGCCTCA AGAGCTTGTC 
GCATTTGAGC GAAGTCTCCC ATGGCTGTTT CTTTTTTAGC ACCTGGGATT TGTGCAATCA 
AGTTATAAAG GTAGACCCCT TGTTGAGAAG TGATTTTTGC ACCGTTAAAG TCATCCAAAG 
ATTTAGCACT TGCGTAGGCA GAATCTTTTT TGACAAGCAA AACTGGTTCG CTAGTATAGT 
AACTGCTCGA AAAGGCAATT TCTTGTTTGC GTTCTGCAGT TGGACTCATA CCTGCGATAA 
TCATGTCAAT CTTACCAGAA GTAAGGGCAG GGACTAGACC TTCCCACTTG GTTTTAACAA 
CCAAAGGTTC TTTACCTAAG TCCTTAGCGA TTTTCTTGGC GATTTGAACA TCGTATCCGT 
TGGCATACTG ATTGGTCCCA TCGATTTTGA CAGCTCCGTT GCTATCATCA TCCTGGGTCC 
AGTTAAAGGG AGCATATGCT GCTTCCATAC CGATGCGTAA ATATTCATCG GCTTGAGCAA 
CATTGACAAG TCCTAGCATC AGCAAGAGAC TTGTGAAAAT AGATAAGTAy ATGTGGCTCA 
TGATTTCTCC TATTCTGATC TATTAAAAAA TAACTGTCTC CTATTTTATC GAAAAATGCG 
TAATTTTTCA ACATAAGTAA GTCTTTACTT ACGAAAAAAT GCTATAATGA TAAGAAAGAT 
AAAAAGGGGG CTTAGTTGAT GAAAAAAACT TTTTTCTTAC TGGTGTTAGG CTTGTTTTGC 
CTTCTTCCAC TCTCTGTTPT TGCCATTGAT TTCAAGATAA ACTCTTATCA AGGGGATTTG 
TATATTCATG CAGACAATAC GGCAGAGTTT AGACAGAAGA TAGTTTACCA GTTTGAGGAG 2940 

GACTTTAAGG GCCAAATCGT GGGACTTGGA CGTGCTGGTA AGATGCCTAG CGGGTTTGAC 3000 

ATTGACCCTC ATCCAAAGAT TCAGGCCGCG AAAAACGGTG CAGAACTAGC AGATGTGACT 3060 

AGCGAAGTAA CAGAAGAAGC GGATGGTTAT ACTGTGAGAG TCTATAATCC AGGTCAGGAG 3120 

GGCGACATAG TTGAAGTTGA CCTCGTCTGG AACTTAAAAA ATTTACTTTT CCTTTATGAT 3180 

GATATCGCTG AATTAAATTG GCAACCTCTG ACAGATAGTT CAGAGTCTAT TGAAAAGTTT 3240 

GAATTTCATG TAAGGGGAGA CAAGGGGGCT GAAAAACTCT TTTTCCATAC AGGGAAACTT 3300 

TTTAGAGAGG GAACGATTGA AAAGAGTAAC CTTGATTATA CTATCCGTTT AGACAATCTT 3360 

CCGGCTAAGC GTGGAGTTGA GTTGCATGCC TATTGGCCTC GGACCGATTT TGCTAGCGCT 3420 

AGGGATCAGG GATTGAAAGG GAATCGTTTA GAAGAGTTTA ATAAGATAGA AGACTCGATT 3480 

GTTAGAGAAA AAGATCAGAG TAAACAACTC GTTACTTGGG TCCTCCCTTC GATCCTTTCC 3540 

ATCTCCTTGT TATTGAGTGT CTGCTTCTAT TTTATTTATA GAAGAAAGAC CACTCCTTCA 3600 

GTCAAATATG CCAAAAATCA TCGTCTCTAT GAACCACCAA TGGAATTAGA GCCTATGGTT 3660 

TTATCAGAAG CAGTCTACTC GACCTCCTTG GAGGAAGTGA GTCCCTTGGT CAAGGGAGCT 3720 

GGAAAATTCA CCTTTGATCA ACTTATTCAA GCTACCTTGC TAGATGTGAT AGACCGTGGG 3780 

AATGTCTCTA TCATTTCAGA AGGAGATGCA GTTGGTTTGA GGCTAGTAAA AGAAGATGGT 3840 

TTGTCAAGCT TTGAGAAAGA CTGCCTAAAT CTAGCTTTTT CAGGTAAAAA AGAAGAAACT 3900 

CTTTCCAATT TGTTTGCGGA TTACAAGGTA TCTGATAGTC TTTATCGTAG AGCCAAAGTT 3960 

TCTGATGAAA AACGGATTCA AGCAAGAGGG CTTCAACTCA AATCTTCTTT TGAAGAGGTA 4020 

TTGAACCAGA TGCAAGAAGG AGTGAGAAAA CGAGTTTCCT TCTGGGGGCT CCCAGATTAT 4080 

TATCGTCCTT TAACTGGTGG GGAAAAGGCC TTGCAAGTGG GTATGGGTGC CTTGACTATC 4140 

CTGCCCCTAT TTATCGGATT TGGTTTGTTC TTGTACAGTT TAGACGTTCA TGGCTATCTT 4200 

TACCTCCCTT TGCCAATACT TGGTTTTCTA GGGTTAGTTT TGTCTGTTTT CTATTATTGG 4260 

AAGCTTCGAC TAGATAATCG TGATGGTGTT CTAAATGAAG CGGGAGCTGA GGTCTACTAT 4320 

CTCTGGACCA GTTTTGAAAA TATGTTGCGT GAGATTGCAC GATTGGATCA GGCTGAACTG 4380 

GAAAGTATTG TGGTCTGGAA TCGCCTCTTG GTCTATGCGA CCTTATTTGG CTATGCGGAC 4440 

AAGGTTAGTC ATTTGATGAA GGTTCATCAG ATTCAAGTGG AAAATCCAGA TATCAATCTC 4500 

TATGTAGCTT ATGGCTGGCA CAGTACGTTT TATCATTCAA CAGCACAAAT GAGCCATTAT 4560 

GCTAGTGTCG CAAATACAGC AAGCACCTAC TCTGTATCTT CTGGAAGTGG AAGTTCTGGT 4620 

GGTGGCTTCT CTGGAGGCGG AGGTGGCGGC AGTATCGGTG CCTTTTAAAG AGAGCTACCA 4680

TAGACTGAAA AAGTATGATA TAATGGAAGA TAGAAAAAAG ACAAACTATA AGAAAAGTCA

ATAGTTTTAT CTAAACTATT TCTTATTTCA ATTTGATGAT TTGGCGATGA TTTTAGAGCA

CGGCAAAAAG CCCTTGAAAA AGTCCATTTT TTCAAAGGTA ATCCTGTGTT AATTTCAGAA

ATTACATCAC TTTTTGTTCG TCAAATGGCA GCTCTTTTTT AGGATATAAA ACAGGGTTCG

GATAAGTTTT TTTGCAAGGT GGATGATGGC TACATTGTAA TGTTTTCCTT ATTCTAACTT

AGTCTTAAGA TAGGCCTTAG AAGCAGGTGA AAAGCGAGGG CATGCTTTGG CAGCTTGTAT

GAGTGCCCAC CGCAGATGAG GGGAACCCCG TTTGACCATT CTTCCAGCTA AATCAATCTG

ACCTGACTGA TAAATAGAAG AATCCAGTCC AGCGAAAGCT TGTAATTGAG CAGGATTATC

AAAGGCATGA ATATTTCGAA TCTCGGCTAA AATGACCGCC CTAAACGATC CCCAATCCCA

GTAACCGTCG TGATGACCGA GTTGAACTCA GCCATCGAGT CATTGATACA TGTTTCCGCC

TTGTCAATGA GCCTCTTGTA ATGCTTGATG ATTTCGAATT CACGAGCAGG AGATGTTGTT

CCGATAGAAC GAGGTGCGAC TGAGAGGATA TCCTGAATTT TAGAAGCGGT CAATCGCTTA

ATTTCTATCA GCTTATCAAA TCCTGCCTCA ATCCTTTTCT GAGGATTAGG GTAGCGTGTC

AAGAGTTGGT AGGTATATTC TGAATGCTTT CCAACGATTT TATCCAACTC AGGAAAGATG

ATATCAAGAC AACGAGTGTA TTGTACTTTC CAATCAGACT GTTTTTCTTG AGACGATGAA

TATGTCTAGC CAGTATTTTT AGGTCTACTT GCCGATTATC GTGTTGAAAT TGTTCACGAT

TGGGGTCAGA AAGAAGTTTA AGAGCGATGC CATGAGCGTC TTTCTTATCC GTTTTAGTCT

TGCGAAGTGA TAATGATTTG GCAAATTCCT TGATGAGCAA AGGATTGTAG GTGTAAACTT

TATATCCTTG TTCATGCAGG AAGTTCAGTA GATTAAAGGC ATAATGTCCA GTATCTTCAA

GAGCGATGAG ACAGTCTTGG TTGATCTGTC GAATAGACAG ATCTAAGAGT TCAAAACCAG

CTTTATTATT TGAAAAAGTG AGTGGTTTAA GAACAGTTTT TCCTGGAACA TTCAAGGCTG

TAACATCGTG TTTATTTTTA GCGATATCAA TGCCTACATA AAGCATGGGA GTACCTCCSG

ATATAGTATT TCAAGTCTAC TTGGTTATCC ACGAATTTTT TGCCTTGTTA CCTTAGACGA

GATCAAACGT CTATGCGTTA TCAAACTCAT TACCAATTGA AACAAAAGCT GTGGTTAGAG

CCTTTCGGAA ATCGTCAAGC GATTGGAGGA AATGAACTAA TCCATAGTGG CTTATTCCAA

GTATACCACT TGGGCTTTGG CAGTAGCTAA CTGCGCTAAA TATAATATAG GGAGTAATCT

ATGTATCTTA TTGAAATTTT AAAATCTATC TTCTTCGGAA TTGTTGAAGG AATTACGGAA

TGGTTGCCGA TTTCCAGTAC AGGTCACTTG ATTTTAGCAG AGGAATTCAT CCAATACCAA

AATCAAAATG AAGCCTTTAT GTCCATGTTT AATGTCGTGA TTCAGCTTGG TGCTATTTTA

GCAGTTATGG TGATTTATTT TAACAAGCTC AATCCTTTTA AACCGACCAA GGACAAACAG

GAAGTTCGTA AGACTTGGAG ACTATGGTTG AAGGTCTTGA TTGCTACTTT ACCTTTACTT

GGTGTCTTTA AATTTGATGA TTGGTTTGAT ACCCACTTCC ATAACATGGT TTCAGTTGCT

CTCATGTTGA TTATCTACGG GGTTGCCTTC ATCTATTTGG AAAAGCGCAA TAAAGCGCGT

GCTATCGAGC CAAGTGTAAC AGAGTTGGAC AAGCTTCCTT ATACGACCGC TTTCTATATC

GGACTCTTCC AAGTTCTTGC TCTTTTACCA GGGACTAGCC GTTCAGGTGC AACGATTGTC

GGTGGTTTGT TAAATGGAAC CAGTCGTTCA GTTGTGACAG AATTTACCTT CTATCTTGGG

ATTCCTGTTA TGTTTGGAGC TAGTGCCTTA AAGATTTTCA AATTTGTGAA AGCCGGAGAA

CTCTTGAGCT TTGGGCAATT GTTTTTGCTC TTGGTCGCGA TGGGAGTAGC TTTTGCGGTC

AGCATGGTGG CTATTCGCTT CTTGACCAGC TATGTGAAAA AACACGACTT CACCCTTTTT

GGTAAATACC GTATCGTGCT TGGTAGTGTT TTGCTACTTT ACAGTTTTGT CCGTTTATTT

GTATAAGAAA AACCTTGAAG GGGCAACTCT TCAAGGTTTT ATACTCTTCG AAAATCTCTT

CAAACCGCGT CAGCTTTATC TGCAACCTCA AAACAGTGTT TTGAGCAGCn CTGCGGCTAG

CCTCCTAGTT TGCTCTTTGA TTTTCATTGA GCTTTAAAAT CCAGTCATGG TAATCCCCAA

TAGGCGGACA CCTCTTTCTT TCTTGCTTAA TTCTTCATAG AGTTGCAGGG CTATTTGGCT

TATCTGACTA GCATCTTGTG TTTTTTGAGC AAGACTTTTT CGTTTGGTAA GAGTTGAAAA

GTCCTCGTAG CGGATTTTCA AAATGACAAT TTTTCCAGCT TTTTCTTGTT GATGTAGATT

GAGAGCGACT TTTTCTGATA GAAGAGTCAG CTCTTTTTTG ATATCTTCCT CAGCAAGGAG

AATCTTCCCG TAGGTTTTCT CCTTGCCGAT TGATTTACGG ATGCGATTGG ATTTGACTGG

AGAGTTGTGA ATGCCACGAG CCTTTCGATA CAGATCATAG CCTAGTCTAC CAAAACGGTC

TATTAGGGTT ACCTCAGGAA CTTCAAGTAA ATCAGCACCA GTAAAAACGC CCATTTGATG

AAGACGTTCT ACTGTCTTTT TTCCTACTCC ATGAAATTTG GAAATATCCA TTTGTTTGAG

AAAATCCTCA GCCTGTTCAG GTAGAATCAC TGTCAAACCA TGTGGTTTTT GATAATCACT

CGCCATTTTA GCTAAGAATT TGTTGTAAGA AACGCCTGCG GAAGCAGTTA GATGGAGTTC

TTGCCAGATA TCTTTTTGAA TGAGGCGAGC AATTTTGACC GCTGACTTGA TACCGAGTTT

ATTTTCTGTC ACATCCAAAT AGGCTTCGTC AATGCTCATG GGTTCAATCA AATCTGTATA

GCGCTTAAAA ATAGCTCGAA TCTGGAGTCC CACAGACTTG TATTTCTCAT AATTCCCTGA

GATAAAGACA GCCTGGGGAC AACGTTCATA AGCTTCCTTG GAACTCATGG CAGAATGGAC

ACCAAAAGCT CTTGCCTCAT AACTACAGGT AGAAACGACT CCCCGTCCAC CTGTTTGCCG

AGGGTCGCTT CCAATAATGA CAGGTTTTCC TCTGAGTTTA GGATTATCCC TGATTTCCAC

TGCAGCAAAA AAGGCATCCA TGTCAATATG GATGATTTTT CTTGACAAAT CATTTAACAA 8280

AGGAAAAATC AACATGCCTA GCACCTTTTT ATACTCTTCG AAAATCTCTT CAAACCACGT 8340

CAGCTCTATm TGCAACCTCA AAACAGTGTT TTGAGCAATC TGCGGCTAGC TTCCTAGTTT 8400

GCTTTTCGAT TTCCATTGAG TGTTACTGCT TATTyTCTTT TATTATACCC TTTTTTCTGA 8460

AAAAAAGAAA AAAGGACTTT ATTTTTTCAA AAATATAATA CAGTTTGAAA TAAAATATAG 8520

ACTGTTTTAG AAAAGAAAGT GTAAAAATAG GGAATTTTCA CTTGTTGAAA TCGGTTACTA 8580

TATGGTATAC TTGTCTTATG AATGTAACAG ATGACTGTTA CTAGAAAAAA GAGGACATTA 8640

ATATGGTTGT TAAGACAGTT GTTGAAGCAC AAGATATTTT TGACAAAGCT TGGGAAGGCT 8700

TCAAAGGCGT AGATTGGAAA GAAAAAGCAA GTGTATCACG ATTTGTACAA GCTAACTACA 8760

CACCTTATGA TGGAGACGAA AGCTTCCTTG CAGGACCAAC AGAGCGTTCA CTTCACATCA 8820

AGAAAATTGT AGAAGAAACT AAAGCACACT ACGAAGAAAC TCGTTTCCCA ATGGACACTC 8880

GTCCAACATC TATCGCTGAT ATCCCTGCTG GATTTATCGA CAAAGAAAAT GAAGTTATCT 8940

TCGGTATCCA AAACGATGAA CTCTTCAAAT TGAACTTCAT GCCAAAAGGT GGTATCCGTA 9000

TGGCTGAAAC TACTTTGAAA GAAAATGGAT ACGAACCAGA CCCAGCTGTT CACGAAATCT 9060

TCACTAAATA TGTAACAACA GTTAACGACG GTATTTTCCG TGCCTACACT TCAAATATTC 9120

GTCGCGCTCG TCACGCACAC ACTGTAACTG GTCTTCCAGA TGCATACTCA CGCGGACGTA 9180

TCATCGGTGT TTACGCACGT CTTGCTCTTT ACGGTGCAGA CTACTTGATG CAAGAAAAAG 9240

TAAATGACTG GAATGCAATC AAAGAAATCG ATGAAGAAAC AATCCGTCTT CGTGAAGAAG 9300

TAAACCTTCA ATACCAAGCA TTGCAACAAG TTGTTCGCCT GGGTGACCTT TACGGGGTTG 9360

ATGTTCGCAA ACCAGCGATG AACGTGAAAG AAGCAATCCA ATGGGTTAAC ATTGCTTTCA 9420

TGGCTGTCTG CCGTGTGATT AACGGTGCTG CTACATCTCT AGGTCGTGTA CCAATCGTAT 9480

TGGACATCTT TGCAGAACGT GACCTTGCTC GTGGTACATT TACTGAATCA GAAATCCAAG 9540

AATTCGTTGA TGATTTCGTT ATGAAACTTC GTACAGTTAA ATTTGCTCGT ACAAAAGCTT 9600

ATGACCAATT GTACTCAGGT GACCCAACCT TTATCACAAC TTCTATGGCT GGTATGGGTA 9660

ACGACGGTCG TCACCGTGTT ACTAAGATGG ACTACCGTTT CTTGAACACT CTTGACAACA 9720

TCGGTAACTC ACCAGAACCA AACTTGACAG TTCTTTGGAC TGACAAATTG CCATACAACT 9780

TCCGTCGCTA CTGTATGCAC ATGAGCCACA AACACTCTTC TATCCAATAC GAAGGTGTAA 9840

CAACAATGGC TAAAGACGGA TATGGTGAAA TGAGCTGTAT CTCATGCTGT GTGTCTCCAC 9900

TTGATCCAGA AAATGAAGAA CAACGCCACA ACATCCAGTA CTTCGGTGCT CGTGTAAACG 9960

TTCTTAAAGC CCTTCTTACT GGTTTGAATG GTGGTTACGA CGATGTTCAC AAAGACTACA 10020

AAGTATTTGA TATCGAACCA ATCCGTGACG AAGTTCTTGA ATTTGAATCA GTTAAAGCGA 10080 

ACTTTGAAAA ATCTCTTGAC TGGTTGACTG ACACTTACGT AGATGCCTTG AACATCATCC 10140 

ACTACATGAC TGATAGGTAC AACTACGAAG CTGTTCAAAT GGCCTTCTTG CCAACTAAAC 10200 

AACGTGCCAA CATGGGATTC GGTATCTGTG GATTTGCTAA CACTGTTGAT ACATTGTCAG 10260 

CTATCAAATA CGCTACAGTT AAACCAATCC GTGACGAAGA TGGCTACATC TACGATTACG 10320 

AAACAATCGG TGACTACCCA CGCTGGGGTG AAGATGACCC ACGTTCAAAC GAATTGGCAG 10380 

AATGGTTGAT CGAAGCTTAC ACAACTCGTC TACGTAGCCA CAAACTATAC AAAGACGCAG 10440 

AAGCTACAGT ATCACTTTTG ACAATCACAT CTAACGTTGC TTACTCTAAA CAAACTGGTA 10500 

ACTCACCAGT TCACAAAGGT GTATACCTCA ACGAAGATGG TTCTGTGAAC TTGTCTAAAC 10560 

TTGAATTCTT CTCACCAGGT GCTAACCCAT CTAACAAAGC TAAAGGTGGT TGGTTGCAAA 10620 

ACTTGAACTC ACTTTCTAGC CTTGACTTTA GTTATGCAGC TGACGGTATC TCATTGACTA 10680 

CACAAGTATC ACCTCGCGCT CTTGGTAAGA CTCGTGATGA ACAAGTTGAT AACTTGGTAA 10740 

CAATTCTTGA TGGTTACTTC GAAAACGGTG GACAACACGT TAACTTGAAC GTTATGGACT 10800 

TGAACGATGT TTACGAAAAA ATCATGTCAG GCGAAGACGT TATCGTACGT ATCTCTGGAT 10860 

ACTGTGTAAA CACTAAATAC CTCACTCCAG AACAAAAAAC TGAATTGACA CAACGTGTCT 10920 

TCCACGAAGT TCTTTCAATG GATGACGCCT TGGATGCATT GAGCTAATCA AGTTCTTGAA 10980 

TAATAAAAAG GAACCCTCGG TCAAACGACT GAGGGTTTTG TGCTTGGGAT AGTATGAGCA 11040 

ATTCCTTCGG CGCAATATGC AATGTTTTTG GGCTCTTTGT CAACTGTAGT GGGTTGAAAA 11100 

AAAGCTAAGC TTGAGAAAGG ACAAATTTCG TCCTTTCTTT TTTGATGTTC AGGGCGATAA 11160 

AAATCCGTTT TTTGAAGTTT TCAAAGTTCC GAAAACCAAA GGCATTGCGC TTGATGTCTT 11220 

TGATGAGTTT GTTAGTGGCC TCAAGTTTAG CGTTAGAATA AGGCAATTCA ATGGCGTTAG 11280 

TGATGTAGTT TTTATAGCAA ATAAATGTGC TCAAAGTGGT TTTAAAGGTG CGGTTGAGAT 11340

GAGGTAACGT GTCTTGAATT AAGCCCCAAA ACTGGTCAGT ATTCTTCTCT TGTAGATGAA 11400 

ATAGGAGTAG TTGATACAGG TCATAGTAAT CTTTAAGTTC AGGTACTAGA GTAAAGATTT 11460 

TCTTCAGACA CTCCCTAGGA GTTAAGGTCT CTCTGAAAGT TCTAGCATAG AAAGGCTTAA 11520 

GAGAGAGTTT CCGACTATCT TTTAGGATAA ATTTCCAGTA ATATTTAAGA GCTCTGTATT 11580 

CCAGAGATTT ATCATCAAAT TGCTTCATGA TGTTGATTCT AGTCTGATTA AGAGCCCTGC 11640 

TCATGTGTTG GACAATGTGG AAACGATCGA GAACAATTTT AGCATTGGGA AATAATTTCT 11700 

TAATGAGAGG GATATAACTT CCAGACATAT CAACAGTGAC GACTTTAACT TTTTTTCTAG 11760

CTTCTTTCGA GTACTTGAAG AAATGATTTC GGATGGTTGT TTGACGTCTG TTATCAAGAA 11820

TGGTCATGAT TTTCTTAGTG TTGAAATCCT GAGCAATGAA AGCCAATTTC CCCTTCTGGT 11880 

AGGAGAATTC ATCCCAGGAG AGGATTTCAG GCAAAGTGGT GTAATCCTCT TGGAAATGAA 11940 

ATTGCTTGAG CTTACGATAG ACGGTAGAGG TAGAGGTAGA GGTAGAGATG GCTAATTTAG 12000 

AAGCGATATG TGTAAGAGCC TCTCTGTTGA GTAGGAGTTG GGCAATTTTC TGTCTCACCA 12060 

TTTCCGAGAT TTGGCAATTT TTCTGAACGA GAGTTGTTTC AGCTACAGTG ACTTTCCGAC 12120 

AGGACTTGCA TTGAAATCCT CTCTTTTTCA AATGAATGAG GCTAGGGAAA CCACCAATCT 12180 

CGATAAAAGG GATTTTAGAA GGCTTTTGGA AGTCGTATTT GATTTGTTTT CCTTTACAGT 12240 

GTTTACATTT AGGTGGGTGA TAATCAAGTG TAGCGAAGAC TTCGATATGG GTATCGTGCT 12300 

GAATGGCTTT ATTTAAGGTG ATGTTTTTGT CTTTTATTCC GATGAGTAAT GTGGTATGAT 12360 

TGATGTGTTC CATAAGATAC TTTCTAATGA GTTGTTTAGG CGCTTTTCAT TATAAGTCTT 12420 

ATGGGACTTT TTTGATACTC AAAAAGCCCT ATAATCTCCA CAGTGGGATT TACCCACTAC 12480 

AGAAATTATA GAGCCAGAAA AAACACTTTT GTTCACTAGC AGAAACTAGA GAGCAGAAGT 12540 

GTTTTTCTGT TCAGATTTAC CCAAAACTGG GAAATATGGG GATAAGAATA GAGATGGCTT 12600 

AGGAAGCCCC TTTTTGTGTG TAGACAGTAC GATGAACTTA TAACAAATAG TGAGCCTTTT 12660 

TAGCAATCAT TGCGACCCGT TTGTCAAAAG CCTCTTTTCG GATATCTACA ATTGTCTGAT 12720 

AGATGAGACG CTGTTGGCTA ACATGCAAAT CTAAGGCAAT CGTCAAAAAG TGATGTTTCC 12780 

CTTTGGGATA CTGCTTTTTA ACGTAAGGCA GGTATTCTTT CGTTGTAATA ATAATCAATG 12840 

GCTCTGTCAA ATGCTCCTCT GAAGGAGGAG GACTAATTAG AATATTGTAT CCTGTAACAG 12900 

AGGCAACTTT GTCAGTAAAA TTCCGTAAAA TAATGGACTT TATTAAGTTT ACATCTGCTT 12960 

GATTATTTAA AATGATAAAA ATCGGGATAG CAGGTAGTGA GGAAAAGATG GTTTCTGTCA 13020 

AGTAGAGTGA GAAAAGGTAC AGCCGATGCT GGTCGATAAC TCCTTCAATC TTCTGCTCAG 13080 

TCATCCACTC TTGAACAATT GCTTTCGAAA TATGATACAG TGGCTTGTCG CTTTCAATCC 13140 

CATAATGTTC GTAATAATTA TAATAGGGAA CTAGATTTTG TAAACCAAAC AAAAACGTTC 13200 

TTGTTAAGAA AGTCAGTGCT GTTAAAAAAG AAAGAGAATT CGAAATGTCA TTTCCTAAGA 13260 

TATTCTTGAA CTTGGATAGT AGATGCTTTC CTCTTGTATG CTGAAGAATC AGTTGAATAG 13320 

TATGAGTCTT TTTTTCTTGA TTCCATTTGT CCTTGGAAAA CGAAGAATTA GCAGAACAAT 13380 

AAACCAAAAA GATATAATCC AGTTCTTCCT GAGTAAAAGT CATGTTGGCA TGTGGCTCTA 13440

AGTAAGTTTG GCAATGTTCC ATCAAAATCG GATACATAAA GAGGTTTTTT AATTTTTCAA 13500

ACTCTTTGGA CTCAGGGAAC TCAAGTGGAA ATTCCCGACG TTTCCAAGTG AGTGCCACTA 13560

GTATGCTAAA ATGAACATAC TCGTCAGGTG TGATTTCTAA CAGTTCATGA CTGAGTTGAG 13620 

AATTAGACTG CACAATCATA TGTGTGACCC AATCCATACT TCCATCATTC AAATCATAAA 13680

TCTCAATACC AAAATGAAAC TGGAGGAGTG CAATTAAAAA ACGAATGCGA TATTCAGGAC 13740 

CAACTACTTG ATTTTTCACA AGGTCCAAAC CTACTGAACG TAGTAACAAG CCACACTTTT 13800 

GTCGTACGCG GTAGCCTGTT GCGATGGAAA TATACTCTTT TTGTGTAAAT TCGTTAAAGC 13860 

TTTGATTACC TTGTAGTAGA AAGAAGCGGA GTATTTTTAA AATAGTTGAT TGGTTATAAA 13920 

GCTGATGGAA GTAATAATTC GTTTGATGAG AATGGTGTTC GATTAATTGA ACTTGTTGCG 13980 

TATCTAAATT AAATGTCAAC TCTTCCTCGA ATGTTTCTTG TAATTCCTGC AAAATGCTTA 14040 

GGAGACTTTT AGATTGTAAT GAAGTTAAAG TAGACAGTTC ATCTAGTTCA ATAGACCGAA 14100 

TATCCAATAA TATATTTAAA ATGGTAATTT TATCTGTAAT TCTTTTTTCA ATGTATTTGT 14160 

TTAGCATAGT TACCGAATCT TAGTTGCATA TAGATAATTT TAATTATTAT AATACAAAAG 14220 

AAACTAATTG TCTTGTCAAA AAGGTTGTGG AATTTCCGAC TTTATTGATA AAACAGCATG 14280 

TAATAAAAGG CATTTTAAAG ATAGTAATGA GTATTGGTGG AGTTTTATGG CTTATTTTTT 14340 

TTATTAGAAA ATATTTTTTT ATCAAATATT GTCGTTCTAT AAAAAAATAT GTGATAAAAA 14400 

TATCTATTGT GATGGAAGTT GTTTTAATTT ATACTAGGAT AGTTAATAGT AATACTATAC 14460 

TATACTATAT TGTATACAAG TGTGTCATTG CCAGGTTGAG AAGATAGCTA TAACGCACTT 14520 

TTATACGCTT TTGCTACGTT TGTTAGTGAA CGGATTAACT CAGTGAGATA AATTTTATCA 14580 

GAACATAAGT AATCCGTTTC TTCGTGTATA CAGATTGAAA GTACCTATGA ATCATAGAAG 14640 

GATTAACTTG TTCTATGAAT AATGCTTAAC AGGGAGACAC ACATGAAAAA AGTAAGAAAG 14700 

ATATTTCAGA AGGCAGTTGC AGGACTGTGC TGTATATCTC AGTTGACAGC TTTTTCTTCG 14760 

ATAGTTGCTT TAGCAGAAAC GCCTGAAACC AGTCCAGCGA TAGGAAAAGT AGTGATTAAC 14820

GAGACAGGCG AAGGAGGAGC GCTTCTAGGA GATGCCGTCT TTGAGTTGAA AAACAATACG 14880

GATGGCACAA CTGTTTCGCA AAGGACAGAG GCGCAAACAG GAGAAGCGAT ATTTTCAAAC 14940 

ATAAAACCTG GGACATACAC CTTGACAGAA GCCCAACCTC CAGTTGGTTA TAAACCCTCT 15000 

ACTAAACAAT GGACTGTTGA AGTTGAGAAG AATGGTCGGA CGACTGTCCA AGGTGAACAG 15060 

GTAGAAAATC GAGAAGAGGC TCTATCTGAC CAGTATCCAC AAACAGGGAC TTATCCAGAT 15120 

GTTCAAACAC CTTATCAGAT TATTAAGGTA GATGGTTCGG AAAAAAACGG ACAGCACAAG 15180 

GCGTTGAATC CGAATCCATA TGAACGTGTG ATTCCAGAAG GTACACTTTC AAAGAGAATT 15240 

TATCAAGTGA ATAATTTGGA TGATAACCAA TATGGAATCG AATTGACGGT TAGTGGGAAA 15300

ACAGTGTATG AACAAAAAGA TAAGTCTGTG CCGCTGGATG TCGTTATCTT GCTCGATAAC 15360

TCAAATAGTA TGAGTAACAT TCGAAACAAG AATGCTCGAC GTGCGGAAAG AGCTGGTGAG 15420

GCGACACGTT CTCTTATTGA TAAAATTACA TCTGATTCAG AAAATAGGGT AGCGCTTGTG 15480

ACTTATGCTT CCACTATCTT TGATGGGACC GAGTTTACAG TAGAAAAAGG GGTAGCAGAT 15540

AAAAACGGAA AGCGATTGAA TGATTCTCTT TTTTGGAATT ATGATCAGAC GAGTTTTACA 15600

ACCAATACCA AAGATTATAG TTATTTAAAG CTGACTAATG ATAAGAATGA CATTGTAGAA 15660

I 1AAAAAATA AGGTACCTAC CGAGGCAGAA GACCATGATG GAAATAGATT GATGTACCAA 15720

TTCGGTGCCA CTTTTACTCA GAAAGCTTTG ATGAAGGCAG ATGAGATTTT GACACAACAA 15780

GCGAGACAAA ATAGTCAAAA AGTCATTTTC CATATTACGG ATGGTGTCCC AACTATGTCG 15840

TATCCGATTA ATTTTAATCA TGCTACGTTT GCTCCATCAT ATCAAAATCA ACTAAATGCA 15900

TTTTTTAGTA AATCTCCTAA TAAAGATGGA ATACTATTAA GTGATTTTAT TACGCAAGCA 15960

AL I AOTGGAG AACATACAAT TGTACGCGGA GATGGGCAAA GTTACCAGAT GTTTACAGAT 16020

AHWiLAlj 111 ATGAAAAAGG TGCTCCTGCA GCTTTCCCAG TTAAACCTGA AAAATATTCT 16080

GAAATGAAGG CGGCTGGTTA TGCAGTTATA GGCGATCCAA TTAATGGTGG ATATATTTGG 16140

CTTAATTGGA GAGAGAGTAT TCTGGCTTAT CCGTTTAATT CTAATACTGC TAAAATTACC 16200

AATCATGGTG ACCCTACAAG ATGGTACTAT AACGGGAATA TTGCTCCTGA TGGGTATGAT 16260

GTCTTTACGG TAGGTATTGG TATTAACGGA GATCCTGGTA CGGATGAAGC AACGGCTACT 16320

AGTTTTATGC AAAGTATTTC TAGTAAACCT GAAAACTATA CCAATGTTAC TGACACGACA 16380

AAAATATTGG AACAGTTGAA TCGTTATTTC CACACCATCG TAACTGAAAA GAAATCAATT 16440

GAGAATGGTA CGATTACAGA TCCGATGGGT GAGTTAATTG ATTTGCAATT GGGCACAGAT 16500

GGAAGATTTG ATCCAGCAGA TTACACTTTA ACTGCAAACG ATGGTAGTCG CTTGGAGAAT 16560

GGACAAGCTG TAGGTGGTCC ACAAAATGAT GGTGGTTTGT TAAAAAATGC AAAAGTGCTC 16620

TATGATACGA CTGAGAAAAG GATTCGTGTA ACAGGTCTGT ACCTTGGAAC GGATGAAAAA 16680

GTTACGTTGA CCTACAATGT TCGTTTGAAT GATGAGTTTG TAAGCAATAA ATTTTATGAT 16740

ACCAATGGTC GAACAACCTT ACATCCTAAG GAAGTAGAAC AGAACACAGT GCGCGACTTC 16800

CCGATTCCTA AGATTCGTGA TGTGCGGAAG TATCCAGAAA TCACAATTTC AAAAGAGAAA 16860

AAACTTGGTG ACATTGAGTT TATTAAGGTC AATAAAAATG ATAAAAAACC ACTGAGAGGT 16920

GCGGTCTTTA GTCTTCAAAA ACAACATCCG GATTATCCAG ATATTTATGG AGCTATTGAT 16980

CAAAATGGCA CTTATCAAAA TGTGAGAACA GGTGAAGATG GTAAGTTGAC CTTTAAAAAT 17040

CTGTCAGATG GGAAATATCG ATTATTTGAA AATTCTGAAC CAGCTGGTTA TAAACCCGTT 17100

CAAAATAAGC CTATCGTTGC CTTCCAAATA GTAAATGGAG AAGTCAGAGA TGTGACTTCA 17160

ATCGTTCCAC AAGATATACC AGCGGGTTAC GAGTTTACGA ATGATAAGCA CTATATTACC 17220

AATGAACCTA TTCCTCCAAA GAGAGAATAT CCTCGAACTG GTGGTATCGG AATGTTGCCA 17280

TTCTATCTGA TAGGTTGCAT GATGATGGGA GGAGTTCTAT TATACACACG GAAACATCCG 17340

TAAAGTGTAG AAATGATAAT ATCTATGTTC TGAACGATAC TTTTAAGAAG TAGCACTCAA 17400

GAAGAGATTT AAGTTTACTT GGTGAAACCT GTTTTATTCG TAAGTAAACT ATCATTGAAA 17460

GGGGAGATGT TTTCGAAAAC TTGCACAGAA AAAGGATTAT TATTGTCATG TGTAATTCAT 17520

TACATTGCTC ACAGTTGATT TTAAGAGATA TGAATAAGGA GAAATCATGA AATCAATCAA 17580

CAAATTTTTA ACAATGCTTG CTGCCTTATT ACTGACAGCG AGTAGCCTGT TTTCAGCTGC 17640

AACAGTTTTT GCGGCTGGGA CGACAACAAC ATCTGTTACC GTTCATAAAC TATTGGCAAC 17700

AGATGGGGAT ATGGATAAAA TTGCAAATGA GTTAGAAACA GGTAACTATG CTGGTAATAA 17760

AGTGGGTGTT CTACCTGCAA ATGCAAAAGA AATTGCCGGT GTTATGTTCG TTTGGACAAA 17820

TACTAATAAT GAAATTATTG ATGAAAATGG CCAAACTCTA GGAGTGAATA TTGATCCACA 17880

AACATTTAAA CTCTCAGGGG CAATGCCGGC AACTGCAATG AAAAAATTAA CAGAAGCTGA 17940

AGGAGCTAAA TTTAACACGG CAAATTTACC AGCTGCTAAG TATAAAATTT ATGAAATTCA 18000

CAGTTTATCA ACTTATGTCG GTGAAGATGG AGCAACCTTA ACAGGTTCTA AAGCAGTTCC 18060

AATTGAAATT GAATTACCAT TGAACGATGT TGTGGATGCG CATGTGTATC CAAAAAATAC 18120

AGAAGCAAAG CCAAAAATTG ATAAAGATTT CAAAGGTAAA GCAAATCCAG ATACACCACG 18180

TGTAGATAAA GATACACCTG TGAACCACCA AGTTGGAGAT GTTGTAGAGT ACGAAATTGT 18240

TACAAAAATT CCAGCACTTG CTAATTATGC AACAGCAAAC TGGAGCGATA GAATGACTGA 18300

AGGTTTGGCA TTCAACAAAG GTACAGTGAA AGTAACTGTT GATGATGTTG CACTTGAAGC 18360

AGGTGATTAT GCTCTAACAG AAGTAGCAAC TGGTTTTGAT TTGAAATTAA CAGATGCTGG 18420

TTTAGCTAAA GTGAATGACC AAAACGCTGA AAAAACTGTG AAAATCACTT ATTCGGCAAC 18480

ATTGAATGAC AAAGCAATTG TAGAAGTACC AGAATCTAAT GATGTAACAT TTAACTATGG 18540

TAATAATCCA GATCACGGGA ATACTCCAAA GCCGAATAAG CCAAATGAAA ACGGCGATTT 18600

GACATTGACC AAGACATGGG TTGATGCTAC AGGTGCACCA ATTCCGGCTG GAGCTGAAGC 18660

AACGTTCGAT TTGGTTAATG CTCAGACTGG TAAAGTTGTA CAAACTGTAA CTTTGACAAC 18720

AGACAAAAAT ACAGTTACTG TTAACGGATT GGATAAAAAT ACAGAATATA AATTCGTTGA 18780

ACGTAGTATA AAAGGGTATT CAGCAGATTA TCAAGAAATC ACTACAGCTG GAGAAATTGC 18840

TGTCAAGAAC TGGAAAGACG AAAATCCAAA ACCACTTGAT CCAACAGAGC CAAAAGTTGT 18900

TACATATGGT AAAAAGTTTG TCAAAGTTAA TGATAAAGAT AATCGTTTAG CTGGGGCAGA 18960

ATTTGTAATT GCAAATGCTG ATAATGCTGG TCAATATTTA GCACGTAAAG CAGATAAAGT 19020

GAGTCAAGAA GAGAAGCAGT TGGTTGTTAC AACAAAGGAT GCTTTAGATA GAGCAGTTGC 19080

TGCTTATAAC GCTCTTACTG CACAACAACA AACTCAGCAA GAAAAAGAGA AAGTTGACAA 19140

AGCTCAAGCT GCTTATAATG CTGCTGTGAT TGCTGCCAAC AATGCATTTG AATGGGTGGC 19200

AGATAAGGAC AATGAAAATG TTGTGAAATT AGTTTCTGAT GCACAAGGTC GCTTTGAAAT 19260

TACAGGCCTT CTTGCAGGTA CATATTACTT AGAAGAAACA AAACAGCCTG CTGGTTATGC 19320

ATTACTAACT AGCCGTCAGA AATTTGAAGT CACTGCAACT TCTTATTCAG CGACTGGACA 19380

AGGCATTGAG TATACTGCTG GTTCAGGTAA AGATGACGCT ACAAAAGTAG TCAACAAAAA 19440

AATCACTATC CCACAAACGG GTGGTATTGG TACAATTATC TTTGCTGTAG CGGGGGCTGC 19500

GATTATGGGT ATTGCAGTGT ACGCATATGT TAAAAACAAC AAAGATGAGG ATCAACTTGC 19560

TTAAGTAAGA GAGAAAGGAG CCATTGATGA CAATGCAGAA AATGCAGAAA ATGATTAGTC 19620

GTATCTTCTT TGTTATGGCT CTGTGTTTTT CTCTTGTATG GGGTGCACAT GCAGTCCAAG 19680

CGCAAGAAGA TCACACGTTG GTCTTGCAAT TGGAGAACFA TCAGGAGGTG GTTAGTCAAT 19740

TGCCATCTCG TGATGGTCAT CGGTTGCAAG TATGGAAGTT GGATGATTCG TATTCCTATG 19800

ATGATCGGGT GCAAATTGTA AGAGACTTGC ATTCGTGGGA TGAGAATAAA CTTTCTTCTT 19860

TCAAAAAGAC TTCGTTTGAG ATGACCTTCC TTGAGAATCA GATTGAAGTA TCTCATATTC 19920

CAAATGGTCT TTACTATGTT CGCTCTATTA TCCAGACGGA TGCGGTTTCT TATCCAGCTG 19980

AATTTCTTTT TGAAATGACA GATCAAACGG TAGAGCCTTT GGTCATTGTA GCGAAAAAAA 20040

CAGATACAAT GACAACAAAG GTGAAGCTGA TAAAGGTGGA TCAAGACCAC AATCGCTTGG 20100

AGGGTGTCGG CTTTAAATTG GTATCAGTAG CAAGAGATGT TTCTGAAAAA GAGGTTCCCT 20160

TGATTGGAGA ATACCGTTAC AGTTCTTCTG GTCAAGTAGG GAGAACTCTC TATACTGATA 20220

AAAATGGAGA GATTTTTGTG ACAAATCTTC CTCTTGGGAA CTATCGTTTC AAGGAGGTGG 20280

AGCCACTGGC AGGCTATGCT GTTACGACGC TGGATACGGA TGTCCAGCTG GTAGATCATC 20340

AGCTGGTGAC GATTACGGTT GTCAATCAGA AATTACCACG TGGCAATGTT GACTTTATGA 20400

AGGTGGATGG TCGGACCAAT ACCTCTCTTC AAGGGGCAAT GTTCAAAGTC ATGAAAGAAG 20460

AAAGCGGACA CTATACTCCT GTTCTTCAAA ATGGTAAGGA AGTAGTTGTA ACATCAGGGA 20520

AAGATGGTCG TTTCCGAGTG GAAGGTCTAG AGTATGGGAC ATACTATTTA TGGGAGCTCC 20580

AAGCTCCAAC TGGTTATGTT CAATTAACAT CGCCTGTTTC CTTTACAATC GGGAAAGATA 20640

CTCGTAAGGA ACTGGTAACA GTGGTTAAAA ATAACAAGCG ACCACGGATT GATGTGCCAG 20700 

ATACAGGGGA AGAAACCCTT GTATATCTTG ATGCTTGTTG CCATTTTGTT GTTTGGTAGT 20760 

GGTTATTGTC TTACGAAAAA ACCAAATAAC TGATATTCAA TGTACATCAT TATGAATAGG 20820

ATAGCAGGCT GAAGGGAAGA CCAGAGTACT CTGAGGTGAT GTTAATCAGG AATCATGGTG 20880 

ATGTGGCATG AATCATCAAT AACGGATATG AGGCTGGGCA GATTGTGCCA GCCTCATTGT 20940 

GGGTTATTGT TTGTAAAACG ATAGGACTGG TCTGGTAATC ATTTTA 20986 



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21040 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



CCCAGCAAAA AGCCATCCGA AGATGACTTT TTTGCTATTT AATTTCTGTA TAAGTTACTT 60

CCAAGCCACG CTTAACAGCT GGACGATTGG CAATTTTTTC TGCCCATTTT ACTAGATTTT 120

GATAACTTGA GGCATCCAAG AATTTTGCAG AACCTTGGTA AAGATTTCCT TGAACTAACT 180

GTCCATACCA AGACCAGATA GCAATATCTG CAATCGTATA GTCATTGCCT GCAATATAAG 240

GTTTCTGAGC CAATTCCTTA TCCAATAAAT CCAACTGGCG TTTCACTTCC ATCGTAAAAC 300

GGTTAATAGG ATATTCCAAT TTTTCAGGAG CATAATTGAA GAAATGTCCA AATCCCCCAC 360

CTAGAAAAGG TGCTGCACCT GCTTGCCAGA ATAGCCAATT CAAAACTTCT ACCTTTTCCA 420

CAGGATTACT TGGTAAAAAG GCTCCAAATT TCTCAGCAAG GTAAAGAAGA ATATGAGCAG 480

ACTCAAAGAC TCTTACGTTT TCAGTACCTG ACTGGTCCAA TAAGGCTGGA ATCTTGGAAT 540

TTGGATTGAG CTTCACAAAG TCTGATCCGA ATTGATCCCC ATCCATGATA GCAATCTTAT 600

ACAAGTCGTA AGCCGCTTCC TTAAAACCAG CTTCTAGTAA TTCTTCCAAT AAGATAGTAA 660

CCTTCACACC ATTTGGTGTT CCCAGTGAAT AAAGCTGAAA AGCTTGTTCT CCTTTTGGCA 720

AGTTTTGTTC GAAACGGGCA CCTGCTGTTG GTCTGTTTAG CCCCGTAAAA GCTCCTTGAT 780

TACTAGCTTC ATCCTGCCAT ACGGTCGGTA ATTGATATGC TGACATCCGA AACCTCCCTT 840

AAATCGCATT CTTGTCAAAA CCGAGTTTGC GTTGAATAAA CTTAACGATT TCGACGATGA 900

TAATCATTGA GAAGCTTCCA GCCATAACAA TTCCCCATTG TGACAAGTCT AGTTTGGTTA 960

CGTGGAAGAT TCCTTCAAGC GGTTCTACAA CGATTGTTGC CATGAGAAGG ATAAAGGATA 1020

CCAAGATGGA CCAGTTAAAG GTCTTAGACT TGAATGGGCC AACTGTCAAG ATGGATTGGT 1080

AGACAGACTT GACATTGTAG GCATGGAAGA GCTGAATCAA ACCAAGGGTT GCAAAGGCCA 1140 

TCGTTAGGGC ATCTGCATGA ATAGCATGAT TGTCACCCAC ATGAACTGGG TAAGCAATCG 1200 

CAAGGCCATA AACACTCATA ACAAGAGCTG CTTGGAGTAC ACCTTGATAA ATGATAGAAC 1260 

TCAAAACACC ACCTGAGAAG AAGCTTGCCT TGCGTCCACG TGGTTTATGA TTCATGACAC 1320 

CAGGTTCCGC AGGTTCAACA CCAAGAGCGA TAGCTGGGAA GGTATCCGTT ACCAAGTTGA 1380 

TCCACAAAAG ATGAACCGGC TGTAAGACAT CCCAACCAAA CAAGGTTGAT AGGAAGATGG 1440 

TTAATACTTC AGCAGTATTA GCAGAAAGTA GGTACTGAAT AGTCTTTTGA ATGTTTGAGA 1500

AGACCTTACG TCCTTCTTCC ACTGCGACGA TAATAGTCGC AAAGTTATCA TCTGCAAGAA 1560 

TCATATCAGA AGCCCCCTTA GAAACCTCTG TACCAGTGAT TCCCATACCG ATACCGATAT 1620 

CGGCTGTTTT CAGAGCTGGC GCGTCATTGA CACCGTCACC TGTCATGGCA ACGACTTTAC 1680 

CTTGTTTTTG CCAAGCCTTG ACGATACGAA CCTTGTGTTC TGGAGACACA CGGGCATAAA 1740 

CAGAGTATTG ACCAACGACT TTTTCAAATT CTTCATCTGA CAGTTCATTG AGTTCAGCAC 1800 

CAGTTAAAAC GTGACCTTCT GTATCGTTTG CGTCAATGAT TCCCAAACGT TTGGCAATGG 1860 

CTTCCGCTGT GTCTTGGTGG TCACCTGTAA TCATAATTGG ACGGATTCCC GCTTCCTTAG 1920 

CCACACGAAC AGCCTCAGCG GCTTCAGGAC GTTCAGGGTC AATCATCCCA ATCAAACCAG 1980 

TAAAAATTAA ATCATTTTCA AGCTCTTCAG AAGTGAGATT TTCTGGAATA CTATCGATAA 2040 

TCTTATAAGC ACCTGCAAGG ACACGCAAGG CTTGATGAGC CATTTCAGAA TTGTTTGTAC 2100 

GAATGAGATT TGTAACCTTC TCATCAATCG GAGCAATATC CCCAGCCTTA TCACGAAGAA 2160 

GACAACGTTT TAAGAGTTGG TCTGGCGCAC CCTTGACTGC TACAAGGAAA CGACCATCTG 2220 

GCAATGGGTG AACTGTTGAC ATGAGCTTAC GGTCAGAGTC AAATGGCAAT TCAGCTACAC 2280 

GAGGATATTT CTCTAAGAAA CCTTTGACAT CATAGCCCTT GTCCAAGGCA TATTGGATAA 2340 

AGGCTGTTTC GGTTGGGTCA CCAATCAAGT TACCTTCCAC ATCGATTTTC GTATCATTGG 2400 

CCAAGACAAC TGAACGAAGT AGTGGCATTT CAAGACCTAG TTCAATATCA TCAGCTGAGT 2460 

CATGTAGAAC CGCATCGTAG AAGACTTTTT CGACTGTCAT CTTGTTCATA GTCAGCGTAC 2520 

CAGTCTTATC AGAAGCGATG ATTTCAGTTG AACCAAGTGT TTCAACTGCT GGCAACTTAC 2580 

GAACGATGGA ATGTCGTTTG GCCAAAACTT GAGTACCAAG AGAAAGAACG ATGGTAACGA 2640 

TAGCAGGAAG TCCTTCTGGA ATGGCTGCAA CGGCAAGGGC AACAGAAGTC AACAACTCAC 2700 

CAAGTGGATT TTTCCCTTGA ATGAAGACAC CCACTACAAA AGTAACAAGG GCAATGACCA 2760

AGATAGCATA GGTCAAGACC TTAGAAAGGT TGTTCAAATT TTGTTTGAGT GGTGTATCAG 2820

TCTCATCCGC ATCTTGAAGC ATACCAGCAA TATGACCAAC TTCAGTGTAC ATACCTGTAT 2880 

TGACAACAAC ACCCATCCCA CGACCATAGG TTACGTTTGA GTTTTGGAAG GCCATGTTGA 2940 

CACGGTCACC AATACCAGCA TCTGTCGCAA GCTCGACTGA CAAGTCTTTT TCGACTGGTA 3000 

CAGATTCACC TGTCAAGGCT GCTTCTTCAA TTTTAAGAGA GTTGGCTTCT ATCAAACGTA 3060 

GGTCCGCTGG TACCACGTCA CCTGCTTCAA GGGCAACGAT ATCGCCTGGT ACCAATTCTT 3120 

TAGAGTCAAT CTCTGCCATG TGTCCATCAC GAAGAACGCG GGCAACTGGA CTAGACATGG 3180 

ATTTGAGGGC TTCAATAGCT TCTTCAGCTT TTCCTTCTTG GTAAACACCA AAGGCAGCGT 3240 

TGATGATAAC CACAGCTAGG ATGATAATGG CATCTGCGAT ATCTTCCCCA CCAGAAGTCA 3300 

CGACTGACAA GATTGctGCC GCAACTAGGA TGATAATCAT CAAATCCTTA AATTGCTCGA 3360 

TGAATTTGAC CAAGATTGAT CGTTTCTCGC CTTCTTCGAG TTCATTGTGC CCAAATTCGG 3420 

CAAGGCGCTT TTCCGCCTCA CTTGATGACA AACCTTGCTC GGTCGCATCC ACAGCCTGCA 3480 

AGACCTCTTC AGGGCTCTGA GTATAAAACG CTTGGCGTTT TTGTTCTTTT GACATGTGTC 3540 

TCCTCCTTGA CATTGTGTGC AAAACAGACT CTCTTTCTGT CATAGCTTTT CACGACAAAC 3600

AAAAAGAAAC CTGTTAATCA TAACAAGTCT CGCTGTTTAA GATAGGGCCG GAAAGCATAC 3660 

TTTTCAGCAT AAAATTCGGA ATGACGACAC TATCACAGGT TTCTGCCAGC TACTCCCTTG 3720 

AGTAGTACCA TTATACCAAA TTTTGGGGAG TTTTCAAAGA GTAAAAACTG CCTTATTTGA 3780 

ATTTTTCCTT GAAAACCAGT ATAATGGTAG AATGCTATGT GACTAGAAAG GAAGTTGAAT 3840 

GAAGCAATCT ATCTCAAATC TCAAGTTAGC TGAGCGTGGA GCCATTATCA GTATTTCGAC 3900 

CTATTTGATC TTGTCTGCAG CCAAATTAGC AGCTGGTCAT CTCCtTCATT CATCCAGTTT 3960

GGTGGCCGAT GGTTTTAATA ACGTATCGGA CATCATTGGA AATGTGGCCC TCTTAATCGG 4020 

GATTCGGATG GCGCGCCACC TGCAGACCGT GACCACCGTT TTGGTCATTG GAAGATTGAA 4080

GATTTGGCAA GCTTGATCAC TTCTATCATC ATGTTCTATG TCGGTTTCGA TGTTCTAAGA 4140 

GATACCATTC AAAAGATTCT CAGTCGGGAA GAAACGGTCA TTGATCCTCT TGGTGCAACT 4200 

CTAGGAATCA TTTCTGCAGC GATTATGTTT GTGGTCTATC TCTACAATAC TCGCCTCAGT 4260 

AAGAAATCCA ACTCCAATGC GCTGAAGGCA GCTGCTAAGG ACAATCTTTC TGACGCTGTT 4320 

ACCTCACTTG GAACCGCCAT TGCCATCCTA GCTAGTAGTT TCAATTATCC GATTGTGGAT 4380 

AAACTGGTTG CTATCATCAT CACTTTCTTT ATCTTGAAGA CTGCCTATGA TATCTTCATC 4440 

GAGTCTTCCT TTAGTCTTTC AGATGGCTTT GACGACCGCC TGCTCGAGGA CTACCAAAAG 4500 

GCTATCATGG AAATTCCCAA AATCAGCAAG GTCAAATCGC AAAGAGGTCG CACCTACGGT 4560 
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AGCAACATCT ACCTGGATAT TACACTAGAG ATGAATCCTG ACTTGTCTGT TTTTGAAAGC 4620 

CATGAAATCG CGGATCAGGT CGAGTCTATG CTGGAGGAGC GTTTTGGCGT CTTTGATACC 4680 

GATGTCCATA TCGAACCAGC ACCTATCCCT GAGGATGAAA TTTTAGACAA TGTCTATAAA 4740 

AAATTGCTTA TGCGTGAACA ATTGATTGAC CAAGGAAACC AACTAGAAGA ACTCTTGACT 4800 

GATGATTTTG TCTATATTCG CCAAGATGGA GAGCAGATGG ATAAAGAGGC TTATAAGACC 4860 

AAAAAAGAGT TAAATTCTGC TATCAAGGAC ATTCAAATTA CTTCCATCAG TCAAAAAACC 4920 

AAACTCATCT GCTATGAGTT AGATGGTATC ATCCATACCA GTATCTGGCG TCGCCACGAA 4980 

ACCTGGCAAA ATATCTTTCA TCAAGAAACC AAAAAAGAAT AGAGAAATCC TTTCATGAGA 5040 

CGGGATTTTT CTATTCTTTT ATACTCAATA AAAATCAAAG TGCAAATTAG GAAGCCGGTC 5100 

ACAGGCTGTA CTTGAGTCGG CAATGTGAAG CCGACATAGT TTGCACTTTG ATTTTCGAAT 5160 

AGTCTTAACT ATCAAATTCA CTGAGATACT CATAGCGTTC GTATTTTTCA AGGAGTGCTT 5220 

CATTTTTCTC ATCCAATTCT TTTTGGAGAG TAGCCAGCTT ACCAAAGTCA GAGCCGTTAG 5280 

CCTGCATTTC CTCTTCAATA GCAGCGATAC GTTTTTCCAA GGTTTCAATA TCACCTTCAA 5340 

TACTTGCCCA CTCCTGCTTT TCTTGGTAGG TCATGCGTTT CTTGTCTTCT CGAACCTTGA 5400 

CCACTTTTTC CTTTTCGGCC TTTTGCACTT GATTGGCCAT ATCTGTTTCA AAAGCTTTTT 5460 

CATCAAGATA GTCGGTGTAA TGACCAAAGA AAGGACGAAT CTTGCCATCC TCAAAAGCGA 5520 

GAATCTTGGT CGCTACCTTA TCCAAGAAAT AGCGGTCGTG ACTGACTGTT AAAACGGGAC 5580 

CTGCAAAACC TTGCAAGAAA TTCTCTAAGA CTGTCAAAGT TGCAATATCT AGGTCATTGG 5640 

TTGGCTCGTC TAAAAGAAGA ACATTTGGTT TTTCCAAAAG CAGTTTGAGG AGATAAAGAC 5700 

GTTTTTTCTC ACCCCCTGAC AATTTCTCAA TCAAAGTCCC ATGCGTCGAA CGTGGGAAGA 5760 

GGAATTGCTC CAGCAACTCA GCGATGGAAG TCGTAGAACC ACCACTGGTC TTGACCTCCT 5820 

CTGCCACTTC CTGCAGGTAA TTGATCACAC GCTTGCTTTC ATCCAAACCC TCAATTTGTT 5880 

GAGAGAAATA GGCGATGCGA ACAGTTTCCC CAATCACAAC TTGTCCTGCT GTCGGCTCAA 5940 

GACTTCCTGC AATCAGGTTA AGTAGGGTTG ATTTTCCAAC ACCATTGTCC CCAACAATTC 6000 

CAATACGGTC TTTAGCCTGA ACTAAGAGAT TAAAATTTTG CAAAATGGGC TTATTTTCAT 6060 

AGGCAAAGGA AACATCCTGA AACTCGATGA CTTTCTTCCC AATCCGACTG GTTTCAAAGT 6120 

TCATAGTCAA GTCTGTCTCA GCACTACTGC CTGAAACTTC CTTTTTCAGA TCATGGAAAC 6180 

GATTGATACG AGCTTGTTGC TTGGTCGCAC GCGCCTGCGG TTGTCTGCGC ATCCAGGCCA 6240 

ATTCTTGTTT GTAGAGTTGT TCTTTTTTGT GAAGAAGAGC CGCGTCGCGC TCATCCTGTT 6300 
CCGCCTTTAG GCGAACATAG TCCTGGTAAT TTCCCTGGTA CTCGGTCAAG CCTGCACGAT 6360

CCAACTCGAA AATCCGTGTT GACAAAGCGT CTAAGAAATA ACGATCGTGA GTGATAAAAA 6420 

GGACGGTCTT CTTAGAATTT TTCAAAAAGA GGGTCAGCCA CTCAATAATC GCAATATCCA 6480 

GATGGTTGGT CGGCTCATCC AAAAGCAAGA GGTCGTGGTT GCCAAGTAAG ACTTGTGCCA 6540 

ACTGTACCCG TCTTCTCAGA CCACCTGACA ATTCCCCAAC AGGAGTAGAT AAGTCTTGAA 6600 

TGCCCAATTT GCTAAGAACG GTCTTGACCT GACTTTCGAT TTCCCAAGCT TGGAGAGAGT 6660 

CCATCTCTGC CATGACACGT TCCAAACGCG CCTGCTTGTC CTCACTATAG TCGAGCATAA 6720 

TCAATTCATA CTCACGAATG AGCTGGATTT CCTTGAGTTC ACTAGATAGA ACCGTATCCA 6780 

AAACTGTCTT TCTATCATCA AAATCAGGAT CCTGAGTCAA GTAACCAATC TGGTAATCAT 6840 

TTTTAGCTGA AAAAGGACTG ACATCCCCAT CAAATCCAGA AACACCAGAA AGGACGTCCA 6900 

AAAGGGTGGT CTTGCCAGTC CCATTGACAC CGATTAAACC AATTCTGTCT AAGTCATGGA 6960 

TAATAAAGGA AATATCCCTA AAAACGGTCT TGTCACCAAC GGATTTACTT AGTTTTTCAA 7020 

CGATAAAATC ACTCATTTTT TCTCCCTCAG GTAAGCATGG ATGGCTTCAC GATTATTCTC 7080 

CAATTCTCCA TCGACAATGG CAAACTCAAT CTCTGTTAAA ATCTCTCCCA AGTCTGGGCC 7140 

TGGCTGATAG CCATATTCCT TGATCAAAAT ACCGCCATTA ATCTGAATCT CTTTCTTGTC 7200 

ATGGATAGTC AAGCTTTGGT ATTTTTCTGT GATGGCTTGT GGGTTGACTT CTTTTCCTTG 7260 

AGCTTGACGA AGATTTTCAG CCTGTAAAAG CAAATCTATG TCAAAGCGAT AACAATCTCG 7320 

CTTGCTCAAT TCTCCATTTT CACGCAGAGC CAAAATAATC AGCAAATCCT GAACTTGCTT 7380 

GGCAAACTGG CGTGAGGTCT TCCAAGATTT CAAAAATGAC TGCGCATTTT CAATCTCCAA 7440 

AGCCCATAGT AAAGCCGCCC AGGCTTGTTC AGAGGATTCA AAAGTAAAAT CAGTCTCCAA 7500 

ATCAAACAGT CTGTTGAGCT TGTCCTGGCT AGATGCCATA TCAGGGAGAT AGTCATAAGC 7560 

TTGACTCTCA ATCATGGAAG CCAAGCCCCT TCTCCAAAAT GGAGCCAGCA AGAGTTTATC 7620 

AAACTCGACG AAGGTACGCT CTACAGAAAT TTTCTCCAAA AGCGGCGTCA AGGTCTTCAT 7680 

AGCTTTAAAT GTTTCTGGCT CAAGTGCAAA ACCAAGACTA GCCTGAAAAC GGAAACCACG 7740 

CATAATCCGT AAAGCATCTT CGTTGAAACG CTCACTAGCC ACTCCAACTG CTCGCAAGAC 7800 

TTGCTTTTCC AAATCTTCTA AACCATGGAA CAAGTCAACG ATTTCTCCTG TCTCATCCAA 7860 

GGCAAAGGCG TTGACTGTGA AATCACGGCG TTTGAGGTCT TCTTCTAGCG ATCGTACAAA 7920

GGAAACCGCA CTGGGTCTGC GATAGTCCAC ATAGACATCC TCTGTCCGAA AGGTTGTTAC 7980 

CTCATACTCC TCATCCCCAT CTAAGACCAA GACGGTTCCA TGCTCGATTC CGATATCGGC 8040 

TGTTCGCGGA AAAATCTGCT TGGTCTCTTC TGGATAAGAA GACGTCGCAA TATCCACATC 8100 
GTGGATAGGG CTATGGAGAA GGGCATCTCG AACAGAGCCC CCAACAAAAT AAGCCTCAAA 8160

GCCTGCTTCT TTAATTTTTT CTAATACTGG TAAAGCCTTC TGAAATTCAG AAGGCATTTG 8220

CGTTAATCTC ATAATAAGTG TTCTAATCCA TAGACAAGCT CATGACGCTT GACAACTTCT 8280

TTAATTCCCA AATTGACTCC TGTCATGAAG GAGATGCGAT CATAGGAGTC ATGACGGAGG 8340

GTCAACCCTT CTCCCTGATT GCCAAAGATG ACTTCCTGAT GAGCTACCAA GCCTGGCAAA 8400

CGAACTGAGT GGATGCGCAT ACCATCAAAG TCAGCACCAC GAGCACCAGC AATCAGCTCT 8460

TCCTCATCTG CTGCACCTTG CTGAATTGAC TCTCGAACCT CTGCCATCAA CTLAW- 1\> 1 1 8520

TTAATGGCTG TTCCACTCGG AGCATCCTTT TTCTTGTCAT GATGGAOL 1 1- aaiaaivill 8580

ACATTTGGGA AATATTTGGC AGCCTGCGTC GCAAATTGCA TGAGTAAGAC AGCA(_t_<_AA<j 8640

GCAAAGTTAG GGGCAATCAG GCCACCCAAG TCTTGGGCAC GAGAAAATTC TTTTAGCTCT 8700

GCAATTTCTT CACTCGTGAA ACCAGTCGTT CCAACTACTG GAGCAAAGCC ATTTTCAAGA 8760

GCAAAACGTG TATTTTCGTA GGCAACAGCT GGAGTAGTAA AATCTACCCA GACATCCGCT 8820

TCAAAACCAG CTAAATCAGC CTTATCCTTG AAAACAGGAA TACCCTGCCA TTCTGACTCA 8880

GACTCAAAAG GATCCAAAAC TGCCACCAAG TCCAAGTCTG GATCAGTCAA TACCATCTGA 8940

CAAGCAGCCT GGCCCATCTT TCCCTTAAAA CCGGCAATAA TTACTCGAAT ACTCATCTCT 9000

ACTCCTGTCT AAGATACAAA GTCCGTAAGA ACACAAAGTG AAAATAGGAA TTCCAATCAA 9060

GAAGTGTCTA CTTCTTGGAA GAACTATCTT TTTCACACAG GGTTCCAGOL. OitllVAAii 9120

ATCAAGATAC AAAGGACCTT AGCTGCCTCT GAAAAATAGG GAATGGCACT GACTTTCCAC 9180

GAAAGGCAAG ACAGGCATCT TTTTTCAAGA GGCAGGTAGT CCGTGTTCAA TTTCTAAGAT 9240

ACAAGGCATC TTAACTAGCC TAGAAGCGCC AACTAAATCA CTGGAATATA ACCCAGAGCA 9300

ATACTTCCTG CTCCTAGGTG CGTTCCAATG ACACTACCAA ATGTAGCAAG TGAAACATCC 9360

GAACCCAAGC CAAAATCAAG CAAGTGcTGA CGCAATTCTT CAGCCTTTTC AGGAGCATTC 9420

CCATGAATGA CAATGACCCG GTATTGACCT GAAGCCGTTG TTTCCTTGAT AATTTCAATT 9480

AAGCGCTTGG TGGCCTTCTT TTCAGTACGA ACTTTTTCGT AAACTTCAAT CACACCTTGA 9540

TCGTTAAAAT AAAGGATTGG CTTAATGCTA AGCAAATTGC CCAAAATGGC AGCCCCATTT 9600

GAAAGGCGTC CACCTTTTAC CAAATGATCC AAGTCATCTA CCATGATAAA GGCTGACGTA 9660

CGGCTGATTT GAATGGCTAG CTTATCCTGA ATGCTGGCAA AATCATCGCC CTGATCACGC 9720

CAATTAAAGA CGCTTTCAAC CATGATGCCT AGGGGAGCAC TTGTAATCAA AGTGTCTGGG 9780

AAAGCAATGG TTAAGCCCTC ATAGTCATCG ACCATATACT GGATATTTTG GTAAAAACCT 9840



WO 98/18931 



PCT/US97/19588 




GAAATTCCAG AAGATAGGAA AAGCCCCAAG GCATGTGTAT AGCCTTGTTC TTTGAGCGAA 9900 

GTTAAGATCT CATCTAACTT GGCAATACTT GGTTGACTGG TCTTAGGCAA TTCAGAAGCC 9960 

TGAGCCATTT TTTGGTAAAA TTCCTCAGCA GACAGATTGA TGCCTTCGAC ATATTCCTCA 10020 

CCATCAATAT TGACAGGAAT ATCCAAGACA AACAAGTCTT CTCTTTGCAA GATCTCTGCA 10080 

CTGAGATAAG CAGAGGAATC TGTGAAAACA GCTAATTTCA TATTAGAACT CCAAATTAAT 10140 

TCCTGGTAAG TCTAATGCAA TTTCAGTCAC TTCGTAAGTC AAACGATTGA GCATGTTCAA 10200 

ACATGGACGA GCCAAGGTTT CCACCTCTTC TTGGTTCAAT TCACTTGGTT CATTGACAAT 10260 

ACGGCCATCG ATATGGTTTA CTTGTGAGAT TGTTCCACTA ATGACAAACT TATCAAATAC 10320 

AATCATAAAG CTCAAGATGA CAATCAAGGA AGTCACTTGA TTTTCTTGGT CATGTTGGAG 10380 

CAATTGGAAA TTCACATCCA CCTTGGTTTC AGGAGCTCCA TTTTCATTTT CCCATTCAAA 10440 

ATTACGCGCA TCAAAATGAT ACTGACTAAC AAATTCTTGT TCACGTTTAA GATTCATGTC 10500 

TTTCTCCATC GGCTACAATA TTATAAGCTA TTGTACCATA ATTTTTTATT TTCATCTAGT 10560 

TTTCTAGGAT TTAGTCAATC CCAATTTCAG CACGAACTAC ATCTGTGATG GTATCAACAT 10620 

AGTAGTTTAC TTCTTCTGTT GTAGGCGCTT CTGCCATAAC ACGCAAGAGG GGTTCTGTTC 10680 

CACTTGGACG AACAAGGATA CGGCCGTTCC CCGCCATTTC TTCTTCCATC TTCTCGATGA 10740 

TGGCCTTGAT AGCTGGCACT TCCATGGCCT TTTCCTTCAT GACGTTTTCC ACTCGGATAT 10800 

TAACTAATTT TTGTGGATAA ATCGTTACTT CTGCCGCCAA CTCTGATAAG CTCTTACCAG 10860 

TTTCCTTCAT GATTTTAGTC AATTGAACTG CTGATAATTG ACCATCACCT GTGGTATTGT 10920 

AATCCATCAA GATAACGTGA CCAGACTGTT CACCACCAAG GTTGTAGCCT GATTTTCTCA 10980 

TTTCTTCAAC AACGTAGCGG TCACCAACTG CAGTAACTGC CTTGTTAATA CCTTCGCGAT 11040 

TCAAGGCCTT GTGGAAACCA AGGTTAGACA TAACAGTTGT CACAATTGTA TTTTGAGCCA 11100 

ATTGTCCTTT TTCAGAAAGG TATTTTCCGA TGATGTACAT AATCTTGTCA CCATCAACGA 11160 

TGTCACCATT CTCATCAACA GCAATCAAGC GGTCACTGTC TCCATCAAAG GCCAAACCAA 11220 

TAGCTGACCC ACTTTCTTTG ACCACTTCTT GAAGGGCTTC TGGATGTGTT GAACCAACAT 11280 

TAAGGTTGAT GTTAAGACCG TCTGGTGTTT CCCCGATAAC CGTCAATTGG GCACCAAGGT 11340 

CTGCAAAGAT TTGACGGGCA CTGGTAGAAG CTGCTCCATT AGCTGTATCC AAGGCAACCT 11400 
TCATTCCATC AAGAGGAGTT CCAGTTGAAA CAAGGTATCC TTCATACTTA CGCArGCtTC 11460

TGGATAATCT ACCAAAATTC CTAAGCCTTC TGCACTTGGA CGAGGAAGAG TGTCTTCCTC 11520 

AGCATCTAGC AAGGCTTCAA TTTCTGCTTC TTTTTCATCA TCTAGTTTGA AGCCATCACC 11580 

GCCAAAGAAC TTGATTCCGT TATCAAGGGC TGGGTTGTGG CTAGCAGAAA TCATGACACC 11640 
GGCACTTGCT CCTTCAGTTT CAACCAAGTA AGCTACTGCT GGTGTTGCAA GGACACCAAG 11700

TTTGTATACG TGAATCCCTA CTGAAAgGAG ACCTGCCACC AAGGCCGATT CCAACATTTC 11760

CCCTGAAATA CGTGTGTCAC GTCCTACAAA GACTTTCGGC GCTTCCGTTT CATGTTGACT 11820

AAGAACATAG CCTCCAAAAC GTCCTAGTTT AAAGGCTAAT TCTGGTGTTA GTTCTAGGTT 11880

AGCTTCTCCA CGGACTCCAT CAGTCCCAAA ATATTTACCC ATTGTTATAA AATCCTTTTC 11940


Inl L X X i/Vll TCGTTTTTGA ACTAGTTGCT TTCGTTGACG AAGATGTCTC CG ATC5 AACTC5 12000

1VJ1 l X w AATTTGATGT GCTTGAACTT GGTGCTACTG GTTTTGTAGT PAPP*I V PPA , P'T , 12060

ATTGTATCAA ACGGAGTGAT AACTGCCGGT AAGACAACAC CATTGCGGTC CI A TTTJ PP*IW" 12120

AAAGGTACTG AACCACTGTA ATTACCTGTT ATACGTTCGC TAGTTGGCAA AAPAfiPfiATA 12180

ATPTTATCAA TTCTATCCAA TGTCTCTTGG TCACTCGTAA TAGACACTTC TTTATPTYJAP 12240

APPATGAPAT TTTCAATTTG TACCCGACTA TCAATTTGAC TAGGGTCAAT fPrTflRTftPA 12300

ATPTTTAPPT TATCCTTCTG AGCCTTCTTA CCAATCTTGA CTGTAATTTT r v r vmrciczhci r vc 12360

rip pap a f^pnn TPARPPCATT GGGTAAATCT TCAATGCTCA AAGGAACTTC 12420

APAPPGGPAT CTGTTAGGTC AGCAGTAACC TTGAATTTAC GTGTACTTTC 12480

ATAGGCGATT TGCACCAGTC AAGACCACTG ATACTTCTGA 12540

PTAATAAAAT ACTTATCACT ATTATAGCGT ATGTCAATAG GGACATTTGT *P APTWI 1 ATT A 12600

GTATAGGTTT CCGTTTTTAC CTGCCTAGCA CTGGTACTGT TTTGAAAATT ccvrcaccGTA 12660

GCATAGACAA ATAAGACACA AGCAAAAAAG AGTGAGGATA TGATATATAA APTA'PTfTTT 12720

TTCATGTTTC CATCCTCCTA GCAATCGTTC TTTAAAACTA AGACCCACTT PPTPTTTTfiPi 12780


AAGTAAGATT TCACGTAATT CTGTTTCAAA TTCATCAAGT GTTAGGTTGT GCTTAAACCT 12840

TCCATTATAG GTTATCGAAA TTCCTCCCGT TTCCTCTGAT ACGACAAAAG TCAAGGCATC 12900

TGAGACTTCT GATAAACCGA TAGCCGCCCG GTGTCTGGTC CCAAATTCCT TGGAAATCCC 12960

TGTGTTTTTT GTCAAGGGCA GATAGGCAGA CGTCACAGCG ATACGTTCTT CTTTGATAAT 13020

CACCGCACCA TCATGTAGGG GAGTGTTGGG AATAAAAATG TTAATGAGAA GTTCTGCAGA 13080

AATCTTAGCA TCCAAGGGAA TTCCTGTCGA AATATACTCC TGCAAGGTAC GTACACGCTG 13140

AATAGCAACC AAGGCCCCGA TTTTACGAGG ACTCATGTAT TCAACAGACT TAACAAAGGC 13200

ACGAATCATC TGTTCCTCAG CACTAATAGG GGCATTGGAA AAGAAATCTG TCGCTCTTCC 13260

CAAACGTTCC AAACCAGTCC GAATCTCTGG AGAGAAGATA ACAACCGCCG CAATAACCCC 13320

ATAAGTAATA ATTTGATTGA TTAACCAAGA AATCGTAGTC AAACCAATCA TATTTGCAAG 13380
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GATTTGAGCT AAAATAAACA CCAAAACTCC ACGTACCAAA ATCATAATCT TGGTTCCTGC 13440 

AATAGCTTTT GTAAAATGGT ATAAAATATA AGCAACAATC AAAATATCAA TCAGATTGAT 13500 

AGCTATCGTC CATGGACTTG CAAACAAACT GGTCCAATAT TGCAGATTGG ATAATTGTTG 13560 

AAAATTCATC CCTGATATCC TCCCTATCAA AACACTTTCG TCCTATTATA CCATTTTCTG 13620 

GCATTTTTTT CCCTATCCTA GTCCATTTTA CATTGAACAA AAATATGATA AAATAAACTG 13680 

ACTAAAAAAA ACAAAGGAGA AACTATGTCT CAACTCTATG ATATTACCAT TGTGGGTGGT 13740 

GGTCCTGTCG GGCTTTTTGC AGCCTTTTAT GCCCACCTAC GCCAAGCCAA GGTTCAAATC 13800 

ATCGACTCTC TTCCCCAGCT AGGTGGACAA CCTGCTATTC TCTACCCTGA AAAGGAAATC 13860 

CTAGACGTAC CAGGCTTCCC AAACCTGACT GGAGAAGAGT TGACTAACCG CTTGATTGAA 13920 

CAGCTAAATG GATTTGATAC CCCTATTCAT CTCAATGAAA CGGTTCTTGA GATTGACAAA 13980 

CAAGAAGAAT TTGCCATCAC AACTTCTAAA GGAAGTCACC TGACTAAAAC AGTTATCATC 14040 

GCTATGGGTG GCGGTGCCTT CAAACCACGT CCGCTGGAAC TTGAAGGGGT TGAGGGCTAT 14100 

GAAAATATCC ACTACCACGT TTCTAACATT CAGCAATACG CTGGTAAGAA AGTGACGATT 14160 

CTTGGTGGGG GAGACTCGGC TGTGGATTGG GCTTTGGCTT TTGAAAAAAT CGCACCAACT 14220 

ACCCTTGTTC ACCGCAGAGA TAATTTCCGT GCCTTGGAAC ACAGTGTTCA AGCCTTGCAA 14280 

GAATCATCTG TAACCATCAA GACACCATTC GCCCCTAGCC AACTCCTTGG AAATGGAAAA 14340 

ACACTTGATA AACTTGAAAT CACAAAAGTC AAATCTGATG AAACTGAAAC CATTGACCTA 14400 

GACCACCTCT TTGTCAACTA TGGTTTCAAA TCTTCTGTCG GTAACCTTAA AAACTGGGGG 14460 

CTCGACCTCA ACCGTCACAA GATTATCGTC AACAGCAAAC AGGAATCCAG CCAAGCAGGT 14520 

ATCTATGCTA TCGGTGACTG CTGCTACTAT GACGGAAAAA TTGATCTGAT TGCGACAGGC 14580 

CTCGGAGAAG CTCCAACTGC TGTCAACAAC GCTATCAACT ACATTGACCC TGAACAAAAA 14640 

GTACAACCAA AACACTCTAC TAGTTTATAA AAAAGAACCA CGAGTCACAT AGGATTCGTG 14700 

GTTTTATAAT TCATCCGCTA TCTTATTGAT TTTTCTGAGT CTGTGATTGA CACCACTTTT 14760 

GGTCAGAGGG GTGCTGAGAC TATCTGCTAA CTGCTGGATA GAGTAGTCTG GGTGCTGAAT 14820 

CCTCAGTTGC GCCACTTCCT GCAAATCTAC TGGCAAATTT TCTAAGCCCA TGATATCTTT 14880

GATTTTACTG ATATTGTTAA TGGTCTTCAT GCTGGCAGAA ACTGTCCGAG CGATATTAGC 14940 

TGTCTCGGCA TTATTAGCCC GATTGAGGTC GTTACGGGTT TCTCGCAAAA TCTTAACCCG 15000 

CTCAAAATCA TCACGTGCCT GCATGGCTCC TATTACTATC AAGAAGTCCA TAATGTCTTC 15060 

TGCTCGCTGG AGATAGGTCA CAGCCCCCTT CTTGCGCTCA AGCACCTTGG CATCCAGTAA 15120 

AAACTGTTGG AGAAGGGAGG CAATTCCTTG CGCGTGGTCC AGATAAACAG AACTGATTTC 15180 
CAACTGGTAC TTGCCTGACT CAGGGTCACG AATGCTCCCA TTTGCCAAGA AAGCGCCACA 15240

GAGATAGGCA CGACCTGCTT CCTCATCCGA TAAAATCGCC TCATCAATAC CTGTTTCCAG 15300

GCCAAAGAAA GAGTCTGCCA AGTGCAAATC ACTTAACAAA TCCTGCACCT TTTCATCTGT 15360

AAAAACGGTA TAGACGCGAT TCTTGCGAAG ATTGCTCCGT TGGTGGTGAC GAATTTCAGA 15420

TTTGATTTCA TAGAGATGGA GAAAGGACTC ATAGAGGTGA CGGGCCAGTT TGGCATTTTC 15480

TGTCACAACT GACAAAGTCA AGCCCGAAGT CGAGAGACCG ATGCTACCAG ACATTTTGAT 15540

AATGGCAGAT AATTCATGCC AGCTCAGATG GTGTTGGCCC AGGATTTCTT CTTTTACTGC 15600

TACTGTGAAA CTCATTTTTT CACCTGTATA ATGCGCATCA ACTCGTCCAC AATCAAATCT 15660

CCATCGTGGA AGGCACCGCC ATTTTCCAGA CGAAGGAAGT TAGATGAAAT CACGCGCGAA 15720

ACTTGCTTAC AAAGACCTAC AAAATCGTGT TCCACTTGCA CTAAGTATTC ATCAAAACGG 15780

TTGGAATTCA TGTATTCCTG AGGCACTTTT TCAATATTCA CCAAGACAGT GTCGATAAAA 15840

GGGCGACCAA GGTGACGATG CAAGACTTCC ACGTGGTCGC TATCTGTAAA GTGTTCCGTC 15900

TCCCCACGTT GGGTCATGAT ATTGCAGACA TAGGCAATTT CTGCCTTGGT TTCCAAAAGA 15960

GCCCGCCCAA TTTCCTTAAT CACGATATTG GGCAAAATAG AGGTAAAGAG GGAACCTGGC 16020

CCTAGGACAA TCATGTCACT TTCAAGGATG GTCTGCACTA CTCGACGGCT GGCCAGAGGC 16080

GTATCATCGT TTAGGGCATT GGTCACATAG ACATTGTCAA TTATGCCTCG ATGGTCTACA 16140

ATATGACTCT CTCCAGCCAC TTCTGTCCCA TCCTGAAAGA CTGCATGAAG GGTCAAAGGA 16200

TGGTCACTGG AAGGATAAAT TTTCCCTGTT GTATGGAAAA ATTTGCTCAA TAACTGCATG 16260

GCATTATAGG TTGAACCCTG CATTTCTGAC AAGCCAGCAA TGATGAGATT TCCCAATGGA 16320

TGGCCAGCAA AGGCTCCGGC ATCCTCAGAG AACCGATACT GAAAGACCTT CTCATAAAAC 16380

TTAGGCATAT CCGACATGGC CACAAGGACA TTACGAAGAT CACCTGGCGG TGTCAACTGT 16440

TGCATATTTT TTCGGAGTTC ACCTGAAGAA CCACCATCAT CTGCCACCGT CACGATAGCT 16500

GCGATTTCCA CATCTTTTTC CCGCAGACTT TTTAGAATGA CGGGACTTCC AGTCCCTCCA 16560

CCAATCACCG TTATCTTTGG TTTTCTCATG AACGGTTTAC CGTTTCCTTT CTGCGGTCTT 16620

TGTCGCGATG CCCTTCATTA ACAGACCAAT TCTTGGATAA GTCCTGCGCC AAGCGTTTAG 16680

CAAATGCCAC ACTACGGTGT TGTCCACCCG TACATCCCAT GGCAATGGTC AAAACGGACT 16740

TACCTTCCTT TTGGTAACTT GGCAGAATCG GCTCAATCAA GGCCAATAAA TGTTGATAAA 16800

AGTCTTCTGA CTCAGGATGG TTCATGACAT AATCATAAAC AGGTTCATCC ACACCCGTTT 16860

GGTTTCTCAG TTCTGGTAAA TAATAGGGAT TTGGCAAGAA ACGGACATCA AAGACCAAGT 16920
CCGCATCAAT CGGGATTCCA TACTTAAATC CGAAAGACAT GACTTCGATA CGGAAAGACT 16980

GGGCTTGTTC TTGGTCTGAA AACTGCTCTG CAAGGGTTTT GCGCAgCTCA CGTGGAGTGA 17040

GTTCAGTCGT ATCCACCACA TTTTGGCTCA TATTTTTCAA AGGTGCCAAG AGTTCACGTT 17100

CCAACTTGAT TCCATCTAAA ATACGACCGT CTGCTGCTAG TGGGTGACTC CGTCrGGTTT 17160

CCTTGTAACG AGCGACCAAT TCCTTATCAG CCGCATCCAA AAAGAGGATT TTGAAATCCA 17220

AACCATCTTG ATTTTCCAAC TCATCCAAAA CAGCTTGAAT CTCTGAAAAG AAAGAACGGC 17280

TACGCATATC CACTACCAAG GCCAACTTAG GATTGTCTTC CTTAATTTCA ACCAGCTGCA 17340

AAAACTTAGG CAAGAGAGCT GGCGGCATAT TATCAATGGT GAAATAACCT AGATCCTCGA 17400

AGGACTGAAT GGCTACAGTT TTCCCTGCGC CACTCATCCC TGTCACAATC ACCAAGTGAA 17460

GTTGTTTCTT TGTCATCTTT TTCTCCTTAT ATCAAAAGAA GTTTGGCAAC ACCAAACTTC 17520

AACTAGCTTA TCCAATCTCT GCGATGACTT CAATTTCGAC TTTTACATCA CGAGGAAGAC 17580

GAGCTACCTC CACAGCTGAA CGAGCTGGGA ATTCCTCTTT GAAGGCCGTT TGGTAAACCT 17640

CATTAAAAGG AACAAAGTCG TTCATATCGC TCAAGAAGCA AGTTGTTTTG ACAACATGGT 17700

CAAAGTCTGT TCCTGCTTCT GCCAAAATAG CACCGATGTT TTTCAAGACT TGCTCTGTCT 17760

GTTCTTGGAT ATTCTCTCCT ACAATTTCCC CAGTTTCAGG GGATAGGGGA ACTTGACCGC 17820

TAGCAAACAA AAGGTTGCCA ACGATTTTTC CTTGAACATA GGGTCCGATA GCCTTTGGGG 17880

CCTTATCTGT ATGAATTGTT TTTGCCATTT TCTTTTCCTC ACAATTTTTC TAAGATTGCA 17940

TCCCAAGCCT CATCCATCCC TGCCTTACTG ACAGATGAAA AGAGGATGAA ATCGTCACTC 18000

GGGTCAAAGT TTAATTTCTT TTTGATTGCT GATTCATGCT TGTTCCATTT ACCACGAGGA 18060

ATCTTGTCCG CCTTGGTCGC CACAATGATG ACTGGAATCT CATAATACTT GAGAAATTCG 18120

TACATCTGCA CATCATCTGC TGACGGGTCA TGACGAAGGT CAACTAGACT GACAACCGCA 18180

CGGAGATTTT CCCGAGTCGT TAAGTACTCC TCAATCATGC ACCCCCACTT TTCACGTTCC 18240

TTTTTAGAAA CACGAGCATA GCCATAACCA GGCACATCCA CAAAGCGCAT CTTGTCATCA 18300

ATGTTAAAAA AGTTCAGGAG CTGGGTTTTA CCAGGTTTTC CTGATGTACG GGCGAGATTC 18360

TTACGGTTCA ACATAGTGTT GATAAAGCTG GATTTACCAA CATTTGAACG CCCTGCTAGG 18420

GCAATCTCTG GCAGTTCATC CTGCGGATAG TGGGACTTAT TAGCTGCACT GAGCAAGATT 18480

TCAGCATTGT GTGTATTAAG TTCCATAGTC ACCTCTAGGC TGTTTCTAGG ATCGGTTTAT 18540

CCGTTCCATC TACAGTTTCT TTAGTGATGC GAACCAATTT CACATTTTCC TGACTCGGCA 18600

CCTCAAACAT GACATCTAGC ATGGTTTCTT CGATGATGGA GCGAAGTCCA CGCGCCCCTG 18660

TCTTCCGTTC GATTGCTTTA TTAGCAATCT CTTGAAGGGC TTCGTCGTCA AATTCCAACT 18720
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CAACATCATC ATAAGAAAGC AAGGTTTGGT ATTGTTTCAC CAAGGCATTT CTTGGCTCTT 18780 

TCAAGATGCG AACCAAGTCA TCAACGGTCA ATTGCTCAAG AGCCGCAAAA ACAGGCAAGC 18840 

GTCCAATCAA CTCAGGGATA ATACCAAATT TTTGAATGTC TTCAGCGATG ATTTCTTGCA 18900 

TGTATGAGCT GTTTTCGTCA ATCGCCTTAT TATTTTGACC AAATCCGATG ACTTTTTCAC 18960 

CCAGACGTTG TTTGACAATT TCTTCAATAC CATCAAAAGC ACCACCCACG ATGAAGAGGA 19020 

TATTTTTTGT ATCCACTTGA ATCATCTCTT GTTGTGGATG TTTGCGTCCA CCTTGAGGCG 19080 

GTACGCTAGC AACAGTTCCC TCAATAATCT TGAGAAGGGC TTGTTGCACC CCTTCACCAG 19140 

AAACATCACG TGTGATAGAC ACATTCTCAC TCTTCTTGGC AATCTTGTCA ATTTCATCCA 19200 

CATAGATAAT GCCACGCTCT GCACGTTCGA TGTTAAAGTC AGCAACCTGC AAGAGTTTGA 19260 

GGAGGATATT TTCCACATCC TCACCCACAT AACCAGCCTC CGTCAGAGCT GTCGCATCCG 19320 

CAATAGCAAA AGGTACATTC AAGCTCTTAG CCAAGGTCTG GGCAAGGAAA GTTTTCCCTG 19380 

AACCAGTTGG GCCAATCATC AAAATGTTTG ACTTCTGCAA ATCCACATCT TCTGACTCTT 19440 

CGCGTGTATC GTGGAAATTG ATGCGTTTGT AGTGGTTATA AACCGCCACT GCCAAGGCAC 19500 

GCTTGGCACG ATCTTGACCA ATTACATAGT GGTTCAAGAT ATGGAGGAGT TCAATTGGTT 19560 

TTGGCACCTC AGACAAGTCT GCCAAGACTT CCTCAACCAA TTCTTCTCGA ATGATTTCCT 19620 

GAGCTAACTC CACGCATTCA TTACAAATAA AAGCATTGTT GCCAGCAATT ATTTTTTGTA 19680 

CTTCTTCTTG GTTTTTGCCA CAAAATGAGC AATAAACCAT CATATCATTT TTTCTATTTG 19740 

TAGACATGAT TTCCTTCCAT TCTATACTGT CATTCTATCT AAAATAAGGT CATGTAAAAA 19800 

GCATGAATAC TATTGACCAG ATTGGTAAAG GCATTTAACC AAAGGAGGAT AGAAAGCCCG 19860 

TAACGCTTTT TACGAAAAGC TTGTGCTCCT GCCAGAAAGC AGATGAAACA CAGAAAAGCC 19920 

GTGAATAGAC CAAATAAACT CCGTTCCATT AGACTTCCTT TCTCTTGCGG TATTGGATGG 19980 

TAAAATCATA AGGATTCTTC TCATCTTTGG CGTAAAATTT GCTTGAAACT GTCTCAAAAA 20040 

GAGACAAGTC AAGTTCTTCA GGGAAATAGG TATCTCCTTC CACCCGAGCA TGAATGTGAG 20100 

TGACAATCAC TTCATCAAGG TAAGGTTCAA AAGCCTGAAA AATTTGCTTC CCACCGATAA 20160 

TGTAGAGATT CTTTTCTTGA GCCTGATACC AGTCAAGAAC AGACTGGACG TCCTGAAAAG 20220 

TAGCAACCCC ATCTATCTTT TCTTCCGGAT TACGCGTCAA AATCAAGGTT TCCCGTTTTG 20280

GAAGCAAGCG ACGCCCCATC CCATCAAAGG TCACACGCCC CATCAAGATA GCATGATTCA 20340 

GAGTTGTTTC TTTAAAGTGC TGCAATTCTG CTGGCAAATG CCAAGGCAGA CGATTTTCCT 20400 

TACCAATCAC ACCCTCTTCA TCCTGGGCCC AAATAGCTAC GATTTTCTTA GTCATGCTTC 20460 




CATCCTTTTC ACTGATAGTA CTATTTTATC AAAAAACTCA AAAAAAGACT GGTTTGGAAT 20520
AGCTTACAAA ATAGAAAAAA TCTGTAAGAA ATTTCCTACA GATTTATCTA TGTTTCCTTA 20580
TTTCTTACAA ACCAGGTGCT TGTCCAAGTT CGGCTGCAAG CATCCAAATT GTTTTATCTG 20640
TTTCAGTTTT AGCGCCTGCA AAGATACCGT TTGTCACATC GTCACCTTCT TCATCAGTGA 20700
CATCCAAACC TTTTTGGAAA AGTTCTGACA AGTAACGGTA GATAACAAGA ACACGTTCCA 20760
AGCTTTCTTC AACATTACGG TATTCACCAG CTTCTTCTTC GATTTCACTA TTTTGAAGGA 20820
ACTCTGTCAA TGTAGAGAAT GGGCTTCCAC CGAGTGTAAT CAAGCGTTCA CTGATTTCAT 20880
CCAATTGACC GTCAAGAGCT TCCATGTACT CATCCATTTT TGGATGCCAT ACAAGGAAAC 20940
CACGACCATG CATATACCAG TGCACTTGGT GCAAAGCAAC GTGAGCTACA TACAAATCAG 21000
CAACAGCTTG GTTCAAGACT TCCTTTGTTT TTGCCAATGC 21040

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2387 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

ATTCTTAATA CGATTAAAAG GCTTATTACT AAAAGAAAAT TTCAGTTAGA TGAACTAAAC 60
TTGCTCGTCA AATCCCGATT TAACGAGATG TTTGGGGAAA ATAAAATATT TGAAAGCATT 120
GATAACTTAT TTGATATTAT AGATGGTGAT AGGGGCAAAA ATTATCCTAA ATCAGATGAG 180
TTGTTTAGTG AGGAGTACTG TTTATTTTTA AATACAAAGA ATGTTACTAA AAACGGATTT 240
TCATTCGATA CAAAGCAATT TATCACTAAA ACAAAGGATA AATTACTTCG AAAAGGCAAA 300
CTTGAGCGTT ATGATATAGT CTTGACAACA AGAGGTACTG TTGGAAATGT AGCGTACTAC 360
GATGAATTAA TAAAATATAA ACATTTACGT ATAAATTCAG GTATGGTAAT ATTACGTCCC 420
AAGACACCAA ATCTAAATCA GAAATTTATT ATCCATGTTT TAAGGAATAA TAATTATAGT 480
CGAGTGATAT CAGGAAGTGC TCAGCCTCAG TTACCAATTA CAAAATTAAA AAAAATACTT 540
CTCCCCCTCC CCCCACTAGC CCTCCAAAAT GAGTTCGCAG ACTTTGTAGT CCAGGTCGAC 600
AAATCACAAT TGGCAATCCA AAAATCTCTG GAAGAACTTG AAACTTTGAA GAAATCTCTG 660
ATGCAGGAGT ATTTTGGCTG ATATTCTGCC ATTGTAATTA CGGTAATGAT TTGTTATAAT 720
ACTTCAAAGG AGGAAATCAG ATGGTAGTAA AAACAAGAAA ACAAGGAAAT TCAATCACCA 780



TTACGATTCC AAGTGAATTT AATATTCCAA GTGGTGTTAA ATACGAAGCG AAATTGTTAC 840 
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CAAGTGGTGA GATTATCTTT ACTCCTGAAG AATTGGGGCA GCAGGTTTCT TATGTATCTG 900 

ATGATGCCTT TGACTTAAAT TTAGATAAAA TATTTGACGA ATACGACGAT GTTTTCAAAG 960 

CTTTGGTGGA AAAATGACAA TCTATTTGAC AGAAAAGCAA ATTGAAAAAA TAAATGCTTT 1020 

AGCAATTCAA CGGTATTCTC CAAATGAGAA AATTCAAACA GTTAGTCCTT CTGCCTTAAA 1080 

TATGATTGTG AACTTACCAG AACAATTTGT CTTTGGGAAG CCTCTTTATC CAACAATTTT 1140 

TGATAAAGCA ACGATACTAT TTGTCCAATT GATAAAGAAG CATGTTTTTG CTAATGCTAA 1200 

TAAAAGAACT GCTTTCTTCG TTTTGGTCAA ATTTTTACAA TTAAACGGCT ATCGTTTTTC 1260 

TGTAACGGTA GAAGAAGCAG TAAAAATGTG TGTAACCATC GCAGTAGAAG CTTTAACTGA 1320 

TGAAAAAATG ACAAGCTACT CCAAATGGAT TTCTGAACAT TCTGTTAGAG AAAAGGTCAA 1380 

AAAGTAACCT AGTATGCTGG ATTTGAATGA GCACAAGAAA ATAAATGAAC AGACAATATT 1440 

AGAATTCTGT AATGCAGAAA CTGATATTGT CTCTTTTTAT TGATGAATAA GAAAGTGAGA 1500 

AATTATGGAA TCAAAAGTTA CAATTATCAT GCAAGAAATG TTACCTCTTT TAAATAATGA 1560 

ACAATTACTA GCGTTGAGAG AGAGTTTAGA ACATCATCTA GTAGACGGAA AAAAGCAGCA 1620 

GAAGTATTCG AATAATAACC TGTTGCAACT ATTTATTACC GCCAAGCAGG TAGAGGGCTG 1680 

TAGCTCAAAA ACAATTCGTT ATTATCAGAG GACGATTGAA AACTTGTTTA ATGCTATTAA 1740 

AGAGTCTGTG ACACAACTCA CAACAGATGA TTTAAGGAGT TATTTAGCAA ATTACCAGTC 1800 

TGAAAAGGAT TGTAGTAAGG CAAATTTAGA CAATATTAGG CGTATATTGT CTTCTTTTTT 1860 

TGCTTGGCTT GAGCAAGAGG ATATATCATT AAAATTCCCA TTCGACGGAT ACAGAAAATT 1920 

AAGACTGAGC AAAATGTGAA GGAAACTTAT ACTGATGAAC ATTTGGAAAT TATGCGTGAT 1980 

AACTGTGAAA ATTTGAGAGA TTTGGCAATA ATAGACCTAC TAGCATCGAC AGGTATGCGT 2040 

GTAGGGGAGC TTGTACAGTT GAATCGTTCA GATATTGATT TTGAAAACAG AGAGTGTGTT 2100 

GTCTTTGGTA AAGGAAAGAA GGAGAGACCA GTATATTTTG ACGCTCGTAC GAAAATTCAT 2160 

TTAAGAAATT ATCTTAACGA CAGAAAAGAT AGTCACCCTG CTCTTTTTGT AACGCTAGTT 2220 

GGAAAAGTCC AGAGGCTTGG AATTGCTGGT GTAGAGATTC GCTTAAGAAA GTTAGGAGAC 2280 

AAACTCGGCA TACAAAAGGT TCACCCACAT AAGTTCAGAA GAACTTTAGC GACTAAGGCA 2340

ATTGATAAAG GTATGCCTAT CGAACAAGTC CAAAAACTGC TAGGTCA 2387 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10669 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
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(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

ATATTAAAGC GACTTTCTGT GCGCTAGGGA AAAATGTTCC TGGGAATGAG GACTTGGTGA 60 

AGAGGATAAA ATCTGAAGGT CATGTTGTTG GAAACCATAG CTGGAGCCAT CCGATTCTCT 120 

CGCAACTCTC TCTTGATGAA GCTAAAAAGC AGATTACTGA TACTGAGGAT GTGCTAACTA 180 

AAGTGCTGGG TTCTAGTTCT AAACTCATGC GTCCACCTTA TGGTGCTATT ACAGATGATA 240 

TTCGCAATAG CTTGGATTTG AGCTTTATCA TGTGGGATGT GGATAGTCTG GACTGGAAGA 300 

GTAAAAATGA AGCATCTATT TTGACAGAAA TTCAGTATCA AGTAGCTAAT GGCTCTATCG 360 

TTTTGATGCA TGATATTCAC AGTCCGACAG TCAATGCCTT GCCAAGGGTC ATTGAGTATT 420 

TGAAAAATCA AGGTTATACC TTTGTGACCA TACCAGAGAT GCTCAATACT CGCCTAAAAG 480 

CTCATGAGCT GTACTATAGT CGTGATGAAT AAGCAAGAAA AAATAGGTCT GTTAGATATT 540 

TGACAGACTT ATTTTTTACA GAATATAGTA CTACTTAAAA AATGTTTTAT GCTATAATTG 600 

ATGAATAAAA TAGAAGGAGA AGCATATGAA TACCTATCAA TTAAATAATG GAGTAGAAAT 660 

TCCAGTATTG GGATTTGGAA CTTTTAAGGC TAAGGATGGA GAAGAAGCCT ATCGTGCAGT 720 

GTTAGAAGCC TTGAAGGCTG GTTATCGTCA TATTGATACG GCGGCGATTT ATCAGAATGA 780 

AGAAAGTGTT GGTCAAGCAA TCAAAGATAG CGGAGTTCCA CGTGAAGAAA TGTTCGTAAC 840 

TACCAAGCTT TGGAATAGTC AGCAAACCTA TGAGCAAACT CGTCAAGCTT TGGAAAAATC 900 

TATAGAAAAA CTGGGCTTGG ATTATTTGGA TTTGTATTTG ATTCATTGGC CGAACCCAAA 960 

ACCGCTCAGA GAAAATGACG CATGGAAAAC TCGCAATGCG GAAGTTTGGA GAGCGATGGA 1020 

AGACCTCTAT CAAGAAGGGA AAATCCGTGC TATCGGCGTT AGCAATTTTC TTCCCCATCA 1080 

TTTGGATGCC TTGCTTGAAA CTGCAACTAT CGTTCCTGCG GTCAATCAAG TTCGCTTGGC 1140 

GCCAGGTGTG TATCAAGATC AAGTCGTAGC TTACTGTCGT GAAAAGGGAA TTTTATTGGA 1200 

AGCTTGGGGG CCTTTTGGAC AAGGAGAACT GTTTGATAGC AAGCAAGTCC AAGAAATAGC 1260 

AGCAAATCAC GGAAAATCGG TTGCTCAGAT AGCCTTGGCC TGGAGCTTGG CAGAAGGATT 1320 

TTTACCACTT CCAAAATCTG TCACAACCTC TCGTATTCAA GCTAATCTTG ATTGCTTTGG 1380 

AATTGAACTG AGTCATGAGG AGAGAGAAAC CTTAAAAACG ATTGCTGTTC AATCGGGTGC 1440 

TCCACGAGTT GATGATGTGG ATTTCTAGAA AATCATAAAA AGAATTGTAC ATTATTCTAA 1500 

TTTTTGATAT AATAGTCAGC AGGAAAGAAA GTCTTATGGC GTTCTTCAAG CGAGCTTGGG 1560 

ATAGTGGGAG CCAAGTAGGG CAAAATAAAG GGCTGGCGCT TTCTGTAGTA TTTTCAAAAA 1620 
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CAATGAAGTA ATAAATTAGG GTGGAACCGC GTTTCTGACG CCCCTAGGTT AAATCAACCT 1680 

AGGATTGTCA GATGTGGTTC TTTTGCTTAT TCAGTCTATT GTGTGAAAGA AAGGAGAGCC 1740 

GTGGACAACC TTTATCTTGT AAAAGACGAT AGTCAACTAG CTACATTTCG TGATTTTGTA 1800 

GTAAGAAATA CTGAAAAGTT GAAAGATTAT CAATCTTTTT TAAAGAATGA ACTTGCAGTC 1860

TGTGATTTAC CGCAAGCTGT TATTTGGTCA GATTTTAATG CTGCTACACA GATTATTAGG 1920 

GAAAGTGCTG TTCCAACCTA TACAAATAAT AGACGAGTGG TTATGACGCC TGATTTAGCT 1980 

GTTTGGAAAG AATTGTATTT GTATCAGTTG ATGGACTACG AGTGTTCTGA GCAAACTCAA 2040 

GCAATAGAAA GTCACTATCA TTCTTTATCT GAAAATTTCC TCTTACAGAT TGTAGGACAT 2100 

GAGTTAGCTC ATTGGTCGGA CATTTTTTAG ATGATTTTGA TGGTTATGAC TCTTATATCT 2160 

GGTTCGAAGA GGGGATGGTT GAATATATTA GTCGCAAGTA TTTCTTGACA GAAGAGGAAT 2220 

TTCAAGCGGA AAAAATTTGT AATCAATCTC TCGTAGAACT TTTTCAGAAG AAGTATAGTT 2280 

GGCATTCATT GAATGATTTT GGTTCTTCGA CTTATGATAA GAACTATGCA AGTATTTTTT 2340 

ATGAATACTG GCGCAGCTTT TTGACAGTAG ATAAGTTGGT AGAAAATTTA GGTAGTGTAC 2400 

AAGCGGTCTT AGATTCTTAT CATTTATGGG CAAATACAGA AAAAACTTTT CCCTTGTTAG 2460 

ATTGGTTTGT TCAGCAGAAA TTAATTGAAA AAGAAATATA AAAACTAAAG GAGTAAACAA 2520 

TGTCTAAGAA ATTAACATTT CACTGCATCA GTGGCAGAGA CCTCCTTACA GTCGGGCTGC 2580 

TCCACGCTCA GCACTAGAGT GCCTGAGCTA GACGCAGTAC TAACTCGTCT TGCCTCGTAT 2640 

GATCGACGAG GCAGACTCGT GTCGCAAGTA ATTATTTTTT ATTAAGGAGT ATTCAATGTC 2700 

TAAGAAATTA ACATTTCACT GCGTCAGTGG CAGAAACCTC CTTACAGTCG GACTGCCCTA 2760

CGCTCAGCAC TAGAGTGCCT GAGCTAGACG CAGTACTAAC TCGTCTTGCC TCGTATAATC 2820 

GACGAGGCAG ACTCGTGTCG CAAGAAATTA TTTTTTATTA AGGAGTATTC AATGTCTAAG 2880

AAATTAACAT TTCAAGAAAT TATTTTGACT TTGCAACAAT TTTGGAATGA CCAAGATTGT 2940 

ATGCTTATGC AGGCTTATGA TAATGAAAAA GGTGCGGGGA CAATGAGTCC TTACACTTTC 3000 

CTTCGTGCTA TCGGACCTGA GCCATGGAAT GCAGCTTATG TAGAGCCATC ACGTCGTCCT 3060 

GCTGACGGTC GTTATGGGGA AAACCCTAAC CGTCTCTACC AACACCACCA ATTCCAGGTG 3120 

GTCATGAAGC CTTCTCCATC AAATATCCAA GAACTTTACC TTGAGTCTTT GGAAAAATTG 3180 

GGAATCAATC CTTTGGAGCA CGATATTCGT TTTGTTGAGG ACAACTGGGA AAACCCATCA 3240 

ACTGGTTCAG CTGGTCTTGG TTGGGAAGTT TGGCTTGACG GAATGGAAAT CACTCAGTTC 3300 

ACTTATTTCC AACAAGTCGG TGGATTGGCA ACTGGCCCTG TGACTGCGGA AGTTACCTAT 3360 
GGTTTGGAGC GCTTQGCTTC TTACATTCAA GAAGTAGACT CTGTCTATGA TATCGAGTGG 3420
GCTGATGGTG TAAAATACGG AGAAATCTTT ATCCAGCCTG AGTATGAGCA CTCAAAATAT 3480
TCATTTGAAA TTTCGGACCA AGAAATGTTG CTTGAAAACT TTGATAAGTT TGAAAAAGAA 3540
GCTGGTCGTG CATTAGAAGA AGGCTTGGTA CACCCTGCCT ATGACTATGT TCTCAAATGT 3600
TCACATACCT TTAATCTGCT TGACGCGCGT GGTGCCGTAT CTGTAACAGA GCGTGCAGGC 3660
TATATCGCTC GTATCCGTAA CTTGGCCCGT GTCGTAGCCA AAACCTTTGT CGCAGAACGC 3720
AAACGCCTAG GCTACCCACT TTTGGATGAA GAAACAAGAG CTAAACTCCT AGCAGAAGAC 3780
GCAGAATAAA GAGAGTGACA AATTACGAAA ATGGGCGAAC AGAGTGAGCC CTGAGCCAGT 3840
TGCCGCAGTG ATGAAGGTAT CCTTAGTGAA ACTAAGGATA CTAGGCAAAA TTGGAGACTT 3900
TTGGCTCCAA TTTTAGCAAT GAAACAACGA AGTTGGTTGC TTGCGTGCCA ATCACATAAG 3960
GCAAACTGGA AAATAAAAAG ATACTTTTCG GAGAAAAAAC ATGACAAAAA ACTTATTAGT 4020
AGAACTCGGT CTTGAAGAAT TACCAGCCTA TGTTGTTACG CCAAGTGAAA AACAACTAGG 4080
CGAAAAAATG GCAGCCTTCC TCAAGGGAAA ACGCCTGTCT TTTGAAGCCA TTCAAACTTT 4140
CTCAACACCA CGTCGTTTGG CTGTTCGTGT AACTGGTCTT GCAGACAAAC AGTCTGATTT 4200
AACAGAAGAT TTCAAGGGTC CAGCAAAGAA AATTGCCTTA GATAGTGATG GAAACTTCAC 4260
CAAAGCAGCT CAAGGATTTG TCCGTGGGAA AGGTTTGACT GTTGAAGATA TCGAATTCCG 4320
TGAAATCAAG GGTGAAGAAT ATGTCTATGT CACTAAGGAA GAAATTGGTC AAGCAGTTGA 4380
AGCCATTGTT CCAGGCATTG TGGATGTCTT GAAGTCACTG ACTTTCCCTG TCAGCATGCA 4440
CTGGGCGGGA AATAGCTTTG AATACATCCG CCCTGTTCAC ACTTTAACTG TTCTCTTGGA 4500
TGAGCAAGAG TTTGACTTGG ATTTCCTTGA TATCAAGGGA AGTCGTGTGA GTCGTGGCCA 4560
TCGTTTTTTG GGACAAGAAA CCAAGATTCA GTCAGCATTG AGCTATGAAG AAGACCTTCG 4620
TAAGCAGTTT GTAATCGCAG ATCCATGTGA ACGTGAGCAA ATGATTGTTG ACCAAATCAA 4680
GGAAATTGAG GCAAAACATG GTGTACGTAT CGAAATTGAT GCGGATTTGC TGAATGAAGT 4740
CTTGAATTTG GTTGAATACC CAACTGCCTT CATGGGAAGT TTTGATGCTA AATACCTTGA 4800
AGTTCCAGAA GAAGTCTTGG TGACTTCTAT GAAGGAACAC CAGCGTTACT TTGTTGTTCG 4860
TGATCAAGAT GGAAAACTCT TGCCAAACTT CATTTCTGTT CGTAACGGAA ACGCAGAGCG 4920
TTTGAAAAAT GTCATCAAAG GAAATGAAAA AGTCTTGGTA GCCCGCTTGG AAGACGGAGA 4980
ATTCTTCTGG CGTGAAGACC AAAAATTGGT GATTTCAGAT CTTGTTGAAA AATTAAACAA 5040
TGTCACCTTC CATGAGAAGA TTGGTTCTCT TCGTGAACAC ATGATTCGTA CGGGTCAAAT 5100
CACTGTACTT TTGGCAGAAA AAGCTAGTTT GTCAGTGGAT GAAACAGTTG ACCTTGCTCG 5160
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TGCAGCAGCC ATTTACAAGT TTGACTTGTT GACAGGTATG GTTGGTGAAT TTGACGAACT 
CCAAGGAATT ATGGGTGAAA AATACACCCT TCTTGCTGGT GAAACTCCAG CGGTGGCAGC 
TGCTATTCGT GAACACTACA TGCCTACATC AGCTGAAGGA GAACTTCCAG AGAGCAAGGT 
CGGCGCAGTT CTAGCCATTG CAGACAAATT GGATACGATT TTGAGTTTCT TCTCAGTAGG 
ATTGATTCCA TCAGGTTCTA ATGACCCTTA TGCCCTTCGT CGTGCAACTC AAGGTGTGGT 
TCGTATCTTG GATGCCTTTG GTTGGCACAT TGCTATGGAT GAGCTGATTG ATAGCCTTTA 
TGCATTGAAA TTTGACAGTT TGACTTATGA AAATAAAGCA GAGGTTATGG ACTTTATCAA 
GGCTCGTGTT GATAAGATGA TGGGCTCTAC TCCAAAAGAT ATCAAGGAAG CAGTTCTTGC 
AGGTTCAAAC TTTGTTGTGG CAGATATGTT GGAAGCAGCA AGTGCTCTCG TAGAAGTAAG 
CAAGGAAGAA GATTTTAAAC CATCTGTTGA ATCACTTTCT CGTGCCTTTA ACCTGGCCGA 
GAAGGCAGAA GGGGTTGCTA CGGTTGATTC AGCACTATTT GAGAATGACC AAGAAAAAGC 
TTTGGCAGAA GCAGTAGAAA CACTCATTTT ATCAGGACCT GCAAGTCAGC AATTGAAACA 
ACTTTTTGCG CTTAGCCCAG TCATTGATGC TTTCTTTGAA AATACTATGG TAATGGCTGA 
AGATCAGGCT GTCCGTCAAA ATCGTTTGGC AATCTTGTCA CAACTAACCA AGAAAGCAGC 
TAAGTTTGCT TGTTTTAACC AAATTAACAC TAAATAAAAT TTGATAAACG GACTTTATCT 
TATTACAAAG GAGAAGAAAT GGATCCGAAA AAAATTGCTC GTATCAATGA GCTTGCTAAA 
AAGAAAAAAA CAGAAGGCTT AACACCAGAA GAAAAAGTGG AACAAGCCAA ACTACGTGAG 
GAGTACATCG AAGGTTATCG CCGCGCTGTT CGTCACCACA TTGAAGGAAT CAAAATTGTG 
GACGAAGAAG GAAACGATGT TACACCAGAA AAACTACGCC AAGTACAACG TGAAAAAGGA 
TTACATGGCC GTAGTCTTGA TGATCCAAAT TCATAATAAT ACTCTTCGAA AATCAAATTC 
AAACCACGTC AGCTTCACCT TGCCGTACTT AAGTACAGCC TGCGGCTAGC TTCCTAGTTT 
GCTCTTTGAT TTTCATTGAG TATATGTATT CTTTCTTTTA ACAAAGATAG ATGAAACGAT 
AACAAAGAGA CTAGCAGTTT GTGTTTGCTA GTCTTTTTTC GCTAAAAAAG GAACCATAAT 
GGTTCCTAAA AACTATCATT AGTAACTTGC ACCGGCTGTA GCGTCTGCGT CACCACCGTG 
GCCTCCAGCA TCCCCTGAAT CAGAAGCGCC AGAAGTAGCA TCGGCGTCTC CATGACCTCC 
GGCAGCAGGA GCAAATGGTC CGCTACCACC CACCAAACGT TGACCAGTCT CTTTTAGGTA 
CCAGTCAAGC CATGGTTGGA AGTTAAAGAC GATTTCATTG ATACCAGCGT ATGATCCATC 
AGGATAGTAC ATTGCTTGGT AGTTGTGAGT GTTGATAACA CCTGCAGGAG AACCTGGAAC 
GATCGTACGG ACGTATTCTT GGTTTCCGTT GCGAAGTGTT CCGATAACCC ACTCTACGTT 
CTTCATACGT GCTGGTGGAA GAGAACCATG AACAGTCGAC ATACGGCTAC CTGATTGAGG 6960 
TGGTACACGT TTAGCGAACA TAGTGTCTGG ATCTTGGTGA GCGTTGTTGT AGTAGAGGAA 7020 
TTGGTTGTTG TCGTCAGCGT ATGTCAATTC AAATGGCATA GCTTTCAAGA ACATATCAAT 7080 
TTGGTTAACT GTTAGGATAC CGTGGTCCAA TTTGACATAG GTATCACCAG AAACAGCACC 7140 
AGTGAATGCT GCAACTTTTT CTACCCATTC TGGATCGTCA GGGTCAACTT CTGTGATGGT 7200 
TGTAGCGATT GGTTTTCCAC AATCCAAGTC TTCTGATTCG ATTGGTTTTG GTTTTTTCAA 7260 
TTTCGAAACG ACTCCTACGT ATTTAACAAA GTTATCTAAG CAAGTTTCAA GGAATTTAAC 7320 
AGTGCCTTCG TTGGTGATAT TTCCGTTGTT ATCAAAAGCT TCCTTAGCTT TACCAAGAAG 7380 


GAATTCGTTA CCTGGAAGCG TGTAGGCATT AACACCTGGA GCATCAAGGA [illegible] 7440 
GTGAACTTGA GCACGTGATG TTCCTTGGTC ATAGTATGAT GCACCCACAA TCATAACAG[?] 7500 
CTTGTTTTCA AATGGATGAA CTTCGTATGA AAGCCATTCA AGTACAGATT TGAGTGAA[??] 7560 
TGAGATAGTG TGGTTATGCT CAGGAGTAGC AATGATAACA CCATCTGCAC GAGTAATTTT 7620 
GTTATATAAA TAACGTAATT GGAAACTTTC ATCCCATTTT TCATCTTGGT TAAACATTGG 7680 
AACTTCGTCA ATTTCAAGAA CTTCTAATTC AAATTTGAGT TTGAAGTAGC GACGGATAAA 7740 
TTCCAAGAGC TTACGGTTAT ATGATTGATC GTAGTTTGAT CCAACAAGTC CAACAAATTT 7800 
CATTCTTTTT GGTCTCCTAT CTTACAAATT TTCCCAGTCA AAGTCTTCAG [illegible] 7860 
AAGTAATTCT TGTGCATTAC GTAATTTTTC TGTGATTTTT ACAAAGATAC GGAAGTCAT[?] 7920 
AAAGATGGCA TCCAATTTCT TGATAACATC AAGGTCAACC AAGTCGCCAC TTGGGTTAAA 7980 
TGCTTGAAGA GAGTGTGAGA GCAAGAATTC ATCTGGAAGA ACATTTGCCT [illegible] 8040 
AGCATTCAAG ATTTGACGAA GTTGCAATTG GGCACGAGAT GAACCAAGCG TACCGTAAGA 8100 


AGCACCTGTA ATCATGATTG GTTTGTTCAA AAGTGGGTAA ATACCATAAG ACAACCAAGC 8160 
AAGAGCGCTC ATCAAAACAG CTGGAATAGA GTGATCATAC TCAGGAGTAC CGATAATAAC 8220 
GCCATCTGCC TCTTCGATTT TAGCAGCAAT TTCCAATATT TCAGCAGGTA CTTGCTTGTC 8280 
AGCTGGTTTG TTGAAGACAG GAATGGCCTT GATTTCAACA AGTTCAATTT CAGCTTTGTC 8340 
AGTAAAGTGT TTTTGCATGT ATTGAAGCAA TTGACGGTTT GTAGAACGTT TTGAATTTGT 8400 
TCCAACAATA GCAATAAGTT TTAACATGAG ATTTCCTTTC TCTTTTTACA TAATACAATT 8460 
TTAAAATTCC ATTGAAACAG TTGTCTCTAT AGAGTAGGAA TTCCTGAAGA ACAGCTTAGG 8520 
TGGCCTTCTT TATCGATGAG GATGACTTCG ATGCCCTCCA AACTTTCGAC TTGCCAGAGG 8580 
ATAGAAGCAG GTCTTTCTCC AAAGAGTCGA GTCGTCCAGA TTTCGCCATC GACTGATTTA 8640 
TCAGAGATGA TTGTTAGACT CGCTAGTTCC GTTTCAACAG GATATCCTGT TTGACTGTCA 8700 
AAAATGTGAT GGTAATCTTG TCCATCGACG GTCAGGTGAC GTTCATAAAT GCCTGAAGTC 8760 


ACGACAGATT TATTGACAAC AGGGATGGTC ATTAAATGAT TTCCCCTAGG ATTGGCTGGG 8820 
TCTTGAATCC CGATTTGCCA TGGGTTATCC CCTCTTGCCT GATTTTTTCC AATGGTCAGG 8880 
ATATTCCCTC CCAGATTGAT CAAGGCAGAA GTCACCCCCT CTTTCCTAAG AAATTGGGCA 8940 
ACCTTATCCG CACTGTATCC TTTGGCTAAA CAACCTAGAT CGATCTTCAT TCCTTTCTGT 9000 
TTTAAAAACA CAGTAGAAGT AGAAGAATCT AACTCGATAC CATGAGGATT GATTAGAGGC 9060 
AGCACCGATT CAATTTCTTG AGGCTGGGCG ACCTTGGCAT CTGAAAAACC GATACGCCAG 9120 
GTTTGAATTA AGGGACCAAT GCTGATATTG AGGTGGCTAG AGAGCGCTAG GCTATGCTCT 9180 
AACCCAAGTG AAATCAGCTC AAACAGGTCT GGATGAACCG TGACGGGGGC TATTCCTGCT 9240 
TGATAATTGA TTTCCATCAA CTCAGATTCT TGACTATTGG CGTTGAAGCG GTATTCAAGT 9300 
TCTTTGAGCA AGTCAAAGGA TTTTTGGAGA AAGATATCGG CTTGCTCATC CACTAATGAA 9360 
ATAGTGATAG TAGTCCCCAT TAGCCGTTCA GAATGTGAAC GAAGAGTCAA GCTACCAACT 9420 
CCTTTCTCTT ATAGAAAATA AGTTGTAATA TCAAATAATC ATCTAAATTG AAGCCCTTAC 9480 
ATTTCATTTT CATGTTATTA TAATACCATA AAGTTAGAAT TTTCACAAAC AAAATTTGGA 9540 
AAAAGTCAAG AAATATGCTC ATAAAATTCA TCAGGCTTGA AAACAGGATA AATGGGGAAT 9600 


TATTTTTGAT AAAAAATGCT GAAATAATAG TACCCCCCTT GTAAACGCTA ACGGTAAATG 9660 
GTATACTAGT AAGGTAAATT TAGAATGAAG GCAGGAAATT TTTATGAGTA AAATCGTTGT 9720 
AGTCGGTGCT AACCACGCTG GTACAGCATG TATCAATACC ATGTTGGATA ATTTTGGAAA 9780 
TGAGAACGAA ATTGTTGTAT TTGACCAAAA CTCTAACATC TCTTTCCTAG GATGTGGAAT 9840 
GGCTCTTTGG ATTGGTGAAC AAATTGACGG TGCTGAAGGC TTGTTCTATT CTGATAAAGA 9900 
AAAATTGGAA GCTAAAGGTG CTAAAGTTTA CATGAACTCA CCTGTTCTTT CAATCGACTA 9960 
TGATAACAAA GTAGTTACAG CGGAAGTTGA AGGAAAAGAG CACAAAGAAT CATACGAAAA 10020 
ATTGATTTTC GCTACAGGCT CTACACCAAT CTTGCCACCA ATCGAAGGTG TTGAAATTGT 10080 
TAAAGGAAAC CGCGAATTTA AAGCAACTCT TGAAAACGTA CAATTCGTGA AATTGTACCA 10140 
AAATGCTGAA GAAGTTATCA ATAAACTTTC TGACAAGAGC CAACACCTCG ACCGTATCGC 10200 
CGTTGTTGGT GGTGGTTACA TCGGTGTTGA ACTTGCTGAA GCCTTTGAAC GTCTTGGAAA 10260 
AGAAGTTGTC CTTGTTGATA TCGTTGATAC TGTCTTGAAC GGTTACTATG ACAAAGACTT 10320 
CACACAAATG ATGGCGAAGA ACTTGGAAGA TCACAACATC CGCTTGGCTC TAGGTCAAAC 10380 
TGTTAAAGCA ATCGAAGGTG ACGGTAAAGT TGAACGCTTG ATTACTGACA AAGAAAGCTT 10440 
TGACGTGGAT ATGGTTATCC TTGCAGTTGG TTTCCGTCCA AACACAGCCC TTGCAGGTGG 10500 

TAAGATCGAA CTCTTCCGCA ACGGTGCCTT CCTTGTAGAC AAGAAACAAG AAACATCTAT 10560 

CCCAGACGTT TACGCTGTTG GTGACTGTGC GACTGTTTAT GACAATGCTC GTAAAGATAC 10620 

AAGCTATATC GCTCTTGCTT CAAATGCTGT GCGCACTGGT AACGTTGGT 10669 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

CGCGCTAATA GATACTTTAT GATAGAATAA AGAACAAGAT TGACAAGTAA GAGGAAACAT 60 

TATGCAAAAT CAAACACTCA TGCAATACTT TGAATGGTAT CTGCCCCACG ACGGTCAACA 120 

CTGGACGCGT CTGGCTGAAA ATGCTCCACA CCTAGCTCAT CTGGGGATCA GTCACGTCTG 180 

GATGCCACCA GCCTTCAAGG CAACCAACGA AAAAGATGTC GGCTATGGGG TCTATGACTT 240 

ATTTGACTTA GGAGAGTTCA ACCAAAAAGG GACTGTCCGC ACCAAGTATG GTTTCAAAGA 300 

AGACTATCTT CAAGCCATTC AAGCCCTTAA AGCACAGGGA ATTCAACCTA TGGCCGATGT 360 

AGTTCTCAAC CACAAGGCTG CTGCCGATCA CAGGGAAGCC TTTCAGGTTA TCGAAGTTGA 420 

TCCTGTAGAC CGTACAGTTG AACTTGGAGA ACCCTTCACC ATCAATGGCT GGACTAGTTT 480 

TACCTTCGAT GGTCGCCAAG ATACCTATAA TGGCTTCCAC TGGCATTGGT ACCACTTCAC 540 

CGGTACAGAC TACGATGCCA AACGCAGTAA ATCTGGGATT TATCTGATCC AAGGGGACAA 600 

CAAGGGCTGG GCCAACGAGG AATTGGTCGA TAACGAAAAC GGAAACTACG ACTACCTCAT 660 

GTATGCCGAC CTAGACTTTA AACATCCTGA AGTCATCCAA AACATCTATG ACTGGGCTGA 720 

TTGGTTCATG GAAACGACTG GTGTAGCTGG TTTCCGTTTG GATGCCGTTA AGCATATTGA 780 

CTCTTTCTTT ATGCGCAACT TCATCCGCGA TATGAAGGAA AAATACGGTG ACGATTTCTA 840 

TGTTTTTGGT GAATTTTGGA ACCCAGACAA GGAAGCCAAT CTGGACTATC TCGAAAAAAC 900 

GGAAGAACAC TTTGACCTTG TCGATGTTCG TCTCCACCAG AATCTCTTTG AAGCCAGTCA 960 

AGCTGGCGCA AACTATGACC TTCGTGGCAT TTTCACAGAT AGCCTGGTTG AACTCAAGCC 1020 

TGACAAGGCT GTGACTTTTG TCGACAACCA CGATACCCAA CGAGGACAAG CCCTTGAGTC 1080 

TACCGTTGAA GAATGGTTCA AGCCAGCAGC CTATGCCCTC ATTTTGTTAC GCCAAGACGG 1140 

CCTTCCATGT GTCTTTTACG GAGACTACTA TGGGATTTCA GGGCAGTATG CTCAAGAAGA 1200 
TTTCAAAGAA ATCCTTGACC GCCTCCTAGC CATCCGAAAA GATTTGGCCT ATGGAGAACA 1260 
AAATGACTAC TTTGACCATG CTAACTGTAT CGGTTGGGTA CGTTCAGGTG CTGAAAATCA 1320 
ATCCCCAATC GCAGTCCTTA TCTCAAATGA CCAAGAAAAC AGCAAGTCAA TGTTTGTCGG 1380 
TCAAGAATGG ACTAATCAAA CCTTTGTAGA TTTACTTGGT AACCACCAAG GTCAAGTTAC 1440 
AATTGATGAG GAAGGTTATG GACAATTCCC TGTCTCAGCT AGATCCGTAA GTGTCTGGGC 1500 
AGTCAATACC ATCTAATAGC TCATAATAAC CAAGCTAGGT CCAAGCGGAT TTGGCTTTTT 1560 
TGTATTCACA AAAAGACCTA CCCAAATGGA TAGATCTTTA CTTGATTACA ATTTACCTGC 1620 
TACTGCATCC AACAATTCTT GGATCTTAGG TTGGTTGCTT CCTCCTGCCA TGGCCATATC 1680 
TGGTTTACCA CCACCACGTC CATCGATGAT TGGTGCTAAT TCTTTGACAA GGTTTCCTGC 1740 
ATGAAGGTCT TTTGTCTTGC TTGCTACAAG GACATTGACT TTGTCACCGA TAGCGGCAAC 1800 
TAGGACAAGA AGATCAGAGT AGTCTTTTTG TTTCCAGTTA TCTGCAAAAG TACGAAGGGC 1860 
ACCGGCATCG GATACAGACA CTTGACTAGC AATGTAACGA TGACCGTTGA CTTCCTTAAC 1920 
ATCTTTGAAG ATATCGCCTG CGGCTGCAGC TGCGGCTTTT TCTTTCAACT CAGCATTTTC 1980 
TTTTTGAAGT TGACGAAGTT GTTCTTGAAG TCCTTCTACC TTGTGAGGTA CTTCCTTGAC 2040 


TTGAGGTGCT TTCAAGGTTG CTGCGATAGC TTTAAGAGCA TCCTCTTGTT CACGATAGGC 2100 
TTCAAAGGCT TCCTTACCAG TCACTGCCAA GATACGGCGA GTTCCTGAAC CGATTCCTTC 2160 
TTCTTTGACA ATTTTGAAGA GACCAATCTC AGAAGTGTTG TCAACATGAG TACCACCACA 2220 
AAGTTCAATA GAGTAGTCAC CGATAGTCAC GACACGAACT TCCTTGCCGT ATTTCTCACC 2280 
AAAGAGGGCC ATAGCTCCCA TTTCTTTAGC AGTGTCAATA TCCGTTTCAA CTGTCTTCAC 2340 
TTCAAGTGCT TCCCAAATTT TCTCGTTAAC TTGCTGTTCA ATCGCACGAA GTTCCTCAGC 2400 
AGTTACTGCT TGGAAGTGGG TAAAGTCAAA GCGAAGGAAT TCAACTTCGT TAAGAGATCC 2460 
TGCCTGTGTT GCGTGGTTTC CAAGGATATT GTGAAGGGCA GCGTGAAGCA AATGAGTCGC 2520 
AGTGTGGTTT TTCATGACAC GGTGACGGCG ATTGCTATCA ATTGCCAAGG TATATTCTTG 2580 
GTTCAAGGCA AGCGGTGCAA GGACTTCAAC TGTATGAAGG GCTTGACCAT TTGGGGCTTT 2640 
CTGAACATTG GTCACAGTAG CCACAACCTT ACCTGACTCA TCCAAGATTT GTCCGTAGTC 2700 
AGCTACCTGT CCACCCATTT CAGCATAAAA TGACGTTTCC GCAAAGATAA GAGAGGCAGT 2760 
TCCTTCTGAA ACAGCTCCTA CTTCTGCATT GTCAGCAACG ATAGCTACCA ATTTAGAAGA 2820 
CAATTGGCTA GCATTGTAGT TGAAGACACT TTCTACAGTG ATGTTTTGAA GAGTTTCATT 2880 


TTGCATACCC ATTGAGCCAC CCTTGACAGC TGACGCACGC GCGCGTTCTT GCTGTTCTTT 2940 
CATGGCTGCT TCAAAACCTT CACGGTCTAC AGTCATACCA GCTTCTTCAG CGATTTCTTC 3000 
AGTCAATTCA ACTGGGAACC CATAAGTATC ATAGAGTTTG AAGACATCTG AACCAGCGAT 3060 
AACAGATTGA CCTTTTTCTT TCAAGTCTGC TACAATGCCT TGGGCAAAGT GTTGACCTGA 3120 
GTGAAGGGTA CGGGCAAATG ATTCTTCTTC GCTCTTAACG ATTTTCTCAA TAAAGTCACG 3180 
TTTCTCAAGC ACTTCTGGGT AGTAGCTTTC CATGATTTTT CCAACAGTTG GAACCAATTT 3240 
GTAAAGGAAA GGCTCGTTGA TACCCAATTT TTGACCATGC ATAGAAGCAC GACGGAGAAG 3300 
ACGACGAAGA ACATAACCAC GACCTTCATT TCCTGGAAGG GCACCATCAC CGATAGCAAA 3360 
TGAAAGAGAA CGAATGTGGT CTGCGATAAC CTTGAAGCTC ATGTTGTCGC CATCTTGGTC 3420 
ATAAACCTTA CCAGACAATT TCTCGACTTC ACGGATAATC GGCATGAAGA GGTCCGTTTC 3480 
AAAGTTGGTC TTAGCCCCTT GGATAACGGC CACCAAACGC TCCAAACCAG CGCCCGTATC 3540 
AATGTTCTTA TGTGGCAATT CCTTGTATTC GCTACGAGGA ACAGCAGGGT CTGCGTTAAA 3600 
TTGTGACAAA ACGATGTTCC AGATTTCAAT ATAACGGTCG TTTTCAATAT CTTCTGCAAG 3660 
CAGGCGAAGA CCGATATTTT CTGGGTCAAA GGCTTCCCCA CGGTCAAAGA AGATTTCTGT 3720 
ATCTGGTCCA GAAGGTCCCG CACCGATTTC CCAGAAGTTG TCCTCAATTG GAATCAAGTG 3780 
ACTTGGATCC ACTCCCACTT CAATCCAGCG GTTGTAAGAA TCTTTATCGT CTGGATAGTA 3840 
GGTCATGTAA AGTTTTTCAG CAGGGAAATC AAACCATTCA GGGCTTGTCA AAAGCTCATA 3900 
AGCCCAAGTG ATAGCTTCGT CACGGAAGTA ATCCCCGATA GAGAAGTTCC CCAGCATTTC 3960 
AAACATGGTA TGGTGACGCG CGGTCTTCCC TACGTTTTCG ATGTCGTTGG TACGGATAGC 4020 
CTTTTGGGCA TTGGTAATAC GTGGATTTTC AGGGATAATG GTCCCGTCAA AGTATTTCTT 4080 
AAGGGTTGCT ACCCCAGAGT TGATCCACAA AAGAGTTGGG TCATTTACAG GAACCAAACT 4140 
TACTGATGGT TCTACTGAGT GACCTTTGGT CGCCCAGAAA TCAAGCCACA TTTGGCGTAC 4200 
TTGTGCACTA GATAGTTGTT TCATATTGTC TCCTTATTCA CTTGTTTAAT GTGATTGGCT 4260 
TTCCAGCATT TCCACATAGT CAATCGCGAC ACAGAGGGAA ATGACTAGGT CTGCATAAGC 4320 
GTCTTCAAGA ACCGTTACGG TATAGGTAGA AGTCAGATGG AAGAGTTCCT TCTTAATTTC 4380 
CGCAATCAAC TGATCGCGAT CATCCAGCAA TTTGAAATTC AAATCCCAGA TATTGCCCTC 4440 
GATACGAAGA CCTAGATTAT CAAACTCATA CTTATCTCGC CAGAAGGTCA ACTTCTTACG 4500 
AATGACAAAA CTCGAGCCAT CCCGAAGCTG AATTTCAAAA CGAGGAAGCA AGGTCAAGAT 4560 
TTCTTTACTA ATCTCACTGA CTTGTTCACC AGCCGCATCA TAGATGGTAA AGGTTTTAGG 4620 
AATCTTAAAA AATGATCCCT CCACCTGATA GGCAATTTCT CCCCTGTCAT CCTTGATAGC 4680 
GAAGCGTTCG CCTCCAAGAC GAAACTTTTG TTTGACAAGA AATGTTTTCA TCAACACCTC 4740 
CAAAAATCAA AAGACAAGCT CATATCACGA AGGGCGAAAA ACCGCGGTAC CACCTTCATT 4800 
CAATGAACTT GTCATTCTCT TGTTCTTATG CAATTGTATG ATTGAGTAGC ATGACTTCCT 4860 
AGCTTAGATG GCTCGCAGCA CCGCCATTTC TCTGGACTAA GACAAGTGAA AATCAATTCT 4920 
CAACTTTCTT ATTATAACGT TTTTTTAAGC TTGCGTCAAC TGGAAATGAT CTCCGTTGAA 4980 
TTAGACCAAT TCCCTACATC TCTGATTACT TTTTCAGGAT ATATTTTTTC TTACTGCCAT 5040 
TTTTCTTTTT ATCCCAAATT TTCATATTAC TAAACACAGC TACTAGAATA TTTCCAAATA 5100 
TAAAGGTGCC TATCACCCAA TATATGGACT CAGTTGTTAG GTATTGTCGA TCCAAGCCAT 5160 
CCTTTAAATG GAATAGTATA GCAGTTTGGT TAACAATCAT AAAGGTTGGC CAGAAACTTT 5220 
TTTTGAAAAA AGTAGACATT TTCATTATTT GTTGCCGCTT TCTGTAAGGT TAATACTCAA 5280 
TAAAAATCAA AAAGCAAACT AGGAAGCTAG CCTCAAGCTG TACTTGAGTA CGGCAAGGCA 5340 
ACGCTGACGT GGTTTGAAGA GTATAGGCTT AGTATACTAC TAGGCAAGCA AATAAACAAA 5400 
TAAACAACTA GAATAGAAAA AGATAGGGCT CTAAAAACTG ACTTCTA[?]TC CTTAAAAACG 5460 
AACCAGCTTG ACTGATTCGT CTTCTTACGT TTATCTCCTA CTTCCGATAC ATTTTAAACT 5520 
GTAGGAAGAG GTCGCTATAT TTCCCTGTCC ATTTATGGTC AAATTTCTCA TAAACTTCTA 5580 
GGTGTTTCAT GGTTTCAACA TCGGGATAGA AGGCCTTATC TTCCTTTGTT TCCTCTGGGA 5640 
GCAATTCCTT CGCTGGTAGG TTTGGTGTTG AATAGCCGAC ATACTCCGCA TTTTGGAGAG 5700 
CATTTTCAGG TTTCAACATA AAGTTGATAA AGGCATAGGC TGAGTTTTGG TTTTTAACTG 5760 
TTTTGGGAAT GACCATATTG TCAAACCAAA GATTGCTGGC CTCTGTCGGT ACCACATAAC 5820 
GTAGATTTTC ATTTTTTTCT AACATTTGGC TGGCTTCACC AGAGAAGGTC ACGCCGATTG 5880 
CAACATTATT CTGAATCATA TAGCCCTTCA TCTCGTCCGC AACGATAGCC TTGATATTTG 5940 
GAGTCAGTTT GTAGAGCTTA TCCACTGTCT CTTCCAACTG CTGCAGATCC TTGGAGTTGA 6000 
GGCTGTAGCC GAGGGAATTG AGTCCTAGTC CCAGCACCTC ACGCGCCCCA TCAAAGAGCA 6060 
TGATAGAATT CTTATACTCC GGCTTCCAAA GGTCATCCCA ATGCTCAGGC GCTTCATCTA 6120 
CCATGGTTTC GTTGTAGACA ATTCCTAAGG TTCCCCAGAA GTAAGGGATG GAGAATTTAT 6180 
TACCTGGGTC AAAGGACTGG TTGAGAAACT CTGGTCCGAT ATTTTCGATT CCTTCAATTT 6240 
TTGAATAATC AAGCGGAACC AAGAGGTCTT CGTCCTTCAT CTTGTTAATC ATGTATTCAC 6300 
TTGGAATGGC AATATCGTAG GTCGTTCCAC CCTGCTTTAT CTTAGTGTAC ATGGCTTCGT 6360 
TGGAGTCAAA AGTCTCGTAC TGAACTTGAA TTCCTGTTTC TTCTGTAAAC TGAGTCAAGA 6420 
GTTCAGGATC GATATAGTCT CCCCAGTTAT AGATAACCAA TTTTTGACTA TCTCGACTAT 6480 
TGATTTTACT ATCTAAATGA GTCGCAATTC CCCACAAGAC AAGGATAATC GCTGCAATTC 6540 
CTGCTAAAAA TGAATAGATT TTTTTCATGC TTGCTCCTCC TTCTCACGAG AGATAAAGTA 6600 
ATAACCTACA ACTAGGATAA TACTAAAGAG AAAGACTAGA GCAGACAGGG CATTGATTTC 6660 
TAAGGAAATC CCCTTGCGAG CACGAGAGTA AATCTCGACT GATAGGGTTG AAAAGCCATT 6720 
TCCTGTTACA AAGAAGGTCA CGGCAAAGTC ATCTAACGAA TAGGTGAAGG CCATGAAATA 6780 
ACCAGTAATG ATAGACGGAG TCAGGTAAGG AAGCATGATT TCCTTGAACA TCTGAAATTG 6840 
ACTAGCTCCC AAGTCATAGG CCGCATGAAT CATGTCGCCA TTCATTTCCT TGAGTCGAGG 6900 
CAAGACCATC AAGACCACGA TAGGAATGGA GAAGGCCACG TGACTAGATA GAACGGTCAA 6960 
AAAGCCAAGT GAAAACTTGA GTTGGGTAAA GAGAATCAAG AAGCTAGCAC CAATCATAAC 7020 
GTCAGGCGCA ACCATGAGGA TATTATTGAG TGATAGAAAG GCTTCTTGGT ATTTCTTACG 7080 
AGACTGGTAG ATGTAAATGG CACCAAAAGT CCCGATAATG GTCGCTATCA AGGCTGATAG 7140 
GAAGGCCAAG AAAAATGTCT GAGCCAAAAT CAGCATGAGT CTCCCATCTC CAAACATGGT 7200 
TTCAAAGTGA GTCCAGCTAA AACCTGTAAA GCTATTCATA TCATCACCAG CATTAAAGGC 7260 
ATAGCCAATC AAGTAAAAGA TAGGCAGGTA GAGGACCAGA AAGACCAGTC CCAGATAAAG 7320 
GTTGGCAAAT TTTTTCATCG TTCTCTCCTT TCCTTAGTCA CCCACATGGT GATGAACATG 7380 
GTCAGGATGA GAATCACACC GATGGTTGAA CCCATACCAT AGTTGTCATT GGTTAGAAAA 7440 
TTCTGCTCAA TAGCCGTCCC CAAGGTGATA ACGCGTTCCC ACCAATCAAA CGGGTCAGCA 7500 
TGAAGAGACT CAAACTTGGG ATAAAGACCG ACTGAACCCC GG 7542 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9223 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



AAAACCAAAT TCCGGTATTT TAACCTATGC TGTAAATACC ATGAAGTCTG TCATGACAGA 60 
TCAGGTCTAT AACATTAAGG TTGAGACAGA AAATGGAAAT TATGTTGGTG AAGCTAGCCA 120 
TGTTTTGGTC CTTTTGACAA ATTACTTCGC TGATAAGAAA ATCTTTGAAG AAAACAAGGA 180 
CGGCTATGCC AACATTTTGA TTCTGAAAGA TGCCTCTATA TTCTCCAAAT TATCCGTCAT 240 
TCCTGATTTA TTAAAAGGGG ATGTTGTCGC AAATGATAAT ATCGAGTATA TCAAAGCGCG 300 
TAATATTAAA ATCTCTTCAG ATAGTGAATT GGAGTCAGAT GTTGACGGAG ATAAATCAGA 360 
TAACCTACCT GTAGAAATCA AAGTCCTAGC TCAGCGAGTA GAAGTATTTT CAAAACCGAA 420 
AGAGGATTAG TATATAGAGA AAGCCTTTTT TAAGGCTTTT TGTATACTTT AAAAGATAGT 480 
TCCTTTAACA ACGGACATTC CTTGCAAATA GTTTTACAAA AATAGTATAC TGGATTCATT 540 
GAGTTTGAAA ACGTTTGCGT AAAATTTGAA TCAATACTTT AGGAGACAAA TTGATGGAAT 600 
TGAGTGCTAT TTACCATAGG CCTGAGTCGG AGTATGACTA TCTTTATAAG GATAAGAAAC 660 
TCCATATTCG AATTCGAACT AAGAAAGGGG ACATTGAAAG CATCAACTTG CACTATGGGG 720 
ACCCTTTTAT CTTTATGGAG GAGTTTTATC AGGATACAAA AGAAATGGTC AAGATAACTT 780 
CTGGTACCTT ATTTGACCAT TGGCAGGTTG AAGTGTCAGT TGACTTTGCA CGTATCCAGT 840 
ATCTCTTTGA GCTCAGAGAT ACAGAAGGTC AAAATATTTT GTATGGCGAT AAAGGGTGTG 900 
TGGAAAATTC TCTAGAAAAT CTTCATGCAA TTGGGAATGG ATTTAAGTTG CCTTAGCTTC 960 
ATGAGATTGA TGCCTGCAAG GTTCCTGACT GGGTTTCAAA TACGGTATGG TATCAGATAT 1020 
TTCCTGAAAG ATTTGCCAAT GGCAATGCTC TATTAAACCC AGAAGGGACT TTAGACTGGG 1080 
ATTCATCTGT CACACCTAAG AGCGATGATT TCTTTGGTGG TGATTTACAG GGGATTATTG 1140 
ATCATATGAA TTACTTGCAA GACTTGGGTA TTACTGGACT ATATCTTTGT CCCATCTTTG 1200 
AATCTACAAG CAATCACAAG TACAATACGA CAGATTACTT TGAAATTGAC CGTCATTTTG 1260 
GAGACAAGGA GACCTTTCGG GAACTGGTGG ATCAAGCGCA TCATCGTGGC ATGAAAGTCA 1320 
TGCTGGATGC GGTATTTAAT CATATTGGTT CGCAATCTCT TCAATGGAAA AATGTCGTCA 1380 
AAAATGGTGA ACAGTCTGCT TATAAGGATT GGTTCCATAT TCAACAATTC CCAGTGACAA 1440 
CTGAAAAGCT AGTTAATAAG AGAGACTTAC CCTATCATGT TTTTGGTTTC GAGGACTATA 1500 
TGCCTAAGCT AAATACAGCC AATCCAGAGG TCAAGAATTA TCTTTTAAAG GTTGCGACTT 1560 
ATTGGATTGA AGAGTTTAAT ATCGATGCTT GGCGTTTGGA TGTGGCTAAT GAGATTGACC 1620 
ATCAGTTCTG GAAGGATTTT CGTAAGGCAG TTTTAGCTAA AAATCCTGAT CTTTATATCC 1680 
TAGGAGAAGT CTGGCATACA TCTCAGCCTT GGCTAAATGG AGATGAGTTC CATGCCGTCA 1740 
TGAATTATCC TTTATCTGAT AGTATCAAGG ACTATTTCTT ACGAGGAATT AAGAAGACAG 1800 
ACCAGTTCAT CGATGAAATC AATGGAGAGT CTATGTATTA CAAGCAGCAG ATTTCAGAGG 1860 
TCATGTTTAA TCTCTTGGAT TCACATGATA CAGAGCGAAT CCTGTGGACG GCCAATGAAG 1920 
ATGTTCAACT GGTTAAATCA GCCTTAGCCT TTCTCTTTTT ACAAAAAGGA ACACCGTGCA 1980 
TTTATTACGG AACCGAGCTA GCCTTGACTG GAGGACCAGA TCCAGATTGT CGTCGTTGTA 2040 
TGCCTTGGGA ACGTGTATCA AGTGACAATG ATATGCTGAA CTTTATGAAG AGGCTGATTA 2100 
AAATTCGGAA ATACGCGTCA GTAATCATTT CGCATGGCAA GTATAGCCTT CAAGAAATCA 2160 


ACTCTGATCT AGTAGCTCTG GAATGGAAAT ACGAAGGACG GATCCTCAAA GCAATATTCA 2220 
ACCAATCAAC AGAAGATTAT CTTTTAGAGA AAGAAGCAGT AGCACTAGCA AGCAATTGCC 2280 
AAGAATTGGA TAATCAGCTT GTCATCTCTC CAGATGGATT TATGATTTTC TAAAAACTAG 2340 
TTGATGAAGA TTATGGTACA TTTCATACCT TATATAGTAT AATAAGGCTA GTTACTAAAC 2400 
TTGTAAAGGA GAACTTAAAT GAATTGTAGA GGACATGAAA CAAGACAAAG AATTGTTAGA 2460 
GATTTTGAAG TTCAGCCTAA AGCACATATT AAGCTGTTAG CAAATCAACA AAAACATAGT 2520 
GATGCAGGAG CAACTATTGA AGATGAATAT TATGTATTTA TCGCTGAGAG TAAAATTGAT 2580 
GGCAAGAAGG AAGTTATTCA GTGTTGCATG GGTGCGGCAA GGGATTTTTT AGAACTAATT 2640 
AATCACAAAG GGCTACCTCT TTTTAATCCG CTTGTAGGTG ATTCTCATGT AAATAATAGA 2700 
CAAGAATATG ACAATACAGG GAGTGGAAAT TTATAACCTG AAAAGTGGAA TGAAACTGCA 2760 
AAGCAGCTTT ATAATGCTAT AATGTGGTTG ATTATTTTAT GGAATGCTAA GCCGGATACA 2820 
CCTTTATTTA ATTTTAAAGA CGAAGTAATT AAGTATAAAA CATATGAGCC TTTTGAAAGC 2880 
AGTATAAAAA GAGTAAATAC TACTATAAAG AATGGTAGTA AAGGGAAAAC TCTGACTGAG 2940 
ATGATTAATG GCTACAGAGC GGATAACGAT ATTAGAGATG AAATTTGTAA CTTTAATATT 3000 


CTGAAAAATA AAATTCGTGA TATGAAAAAC CAACAAGGAA ATACAATGGA ATCTTACTTT 3060 
TAGTTATTGT TGAATTTTGG GTATTCTATA AAATATCCTA ATTGAGATTT AAATAGTAGA 3120 
CTATACAATA TAGTTAAAAT ATCAGTAAAA ACAACACTTT ATTGAGGTAT TGGATACGCT 3180 
TTGCTAATAG CCTAATAATC ACATGTGGAG TGTTGCTACA ACGAAAAAGG TGATAATCCT 3240 
TGATTTCAAG CTATTTTATA AGCATTTTGT CTTTCTAGAT AAAGGCAATT TTGACAATAA 3300 
AAATCCTAAA AGGTGAATCG TTATAGATGT ATTTGTAGAT ATCGTTTGCG CATCGAAAAA 3360 
ATTAATACAA GAATAAATAT TTATAGCTCT TTAGGTGACT TTTATAGAAG TAAAGTTTAG 3420 


GATAGAAAAA CAAGAAATAA [illegible] 3480 


GGATTTATAA AAATAAAGGA GTTTGCTATG ATTGGAAAGA ACATAAAATC CTTGCGTAAA 3540 
ACACATGACT TAACACAACT CGAATTTGCA CGGATTGTAG GTATTTCACG AAATAGTCTG 3600 
AGTCGTTATG AAAATGGAAC GAGTTCAGTC TCTACCGAAT TAATAGACAT CATTTGTCAG 3660 
AAGTTTAATG TATCTTATGT CGATATTGTA GGAGAAGATA AAATGCTCAA TCCTGTTGAA 3720 
GATTATGAAT TGACTTTAAA AATTGAAATT GTGAAAGAAA GAGGTGCTAA TCTATTATCT 3780 
CGACTCTATC GTTATCAAGA TAGTCAGGGA ATTAGCATTG ATGATGAGTC TAATCCTTGG 3840 
ATTTTAATGA GTGATGATCT ATCTGATTTG ATTCATACGA ATATCTATCT AGTAGAAACT 3900 
TTTGATGAAA TAGAGAGATA TAGTGGCTAT TTGGATGGAA TTGAACGTAT GTTAGAGATA 3960 

TCTGAAAAAC GGATGGTGGC CTAATGGAAA TCCAAGATTA TACTGATAGT GAATTCAAAC 4020 

ATGCTTTAGC AAGGAATCTT CGTTCACTGA CAAGAGGAAA AAAGTCCAGT AAGCAACCTA 4080 

TAGCGATTTT GCTTGGAGGG CAAAGTGGTG CCGGTAAGAC TACAATTCAT CGTATTAAAC 4140 

AGAAAGAATT TCAAGGAAAT ATTGTTATCA TAGATGGTGA TAGTTTTCGT TCTCAGCATC 4200 

CACACTATTT AGAACTGCAG CAAGAATATG GCAAAGACAG TGTAGAATAT ACCAAAGATT 4260 

TTGCAGGAAA AATGGTAGAG TCTTTAGTAA CAAAATTGAG TAGTTTGAGA TACAATCTTT 4320 

TGATAGAGGG AACTTTACGA ACAGTTGATG TTCCAAAGAA AACAGCACAA CTCTTGAAAA 4380 

ATAAGGGATA TGAAGTACAA TTGGCCTTAA TTGCGACAAA GCCTGAATTG TCGTATCTAA 4440 

GTACTCTTAT CCGTTATGAA GAACTGTACA TTATCAATCC AAATCAAGCA CGCGCAACTC 4500 

CAAAAGAACA TCATGATTTC ATTGTAAATC ATCTAGTTGA TAACACACGA AAATTGGAAG 4560 

AACTAGCTAT CTTTGAAAGA ATTCAAATTT ACCAACGAGA TAGAAGTTGT GTATATGATT 4620 

CAAAAGAAAA TACAACTTCA GCAGCAGATG TTCTTCAAGA GTTACTCTTT GGGGAGTGGA 4680 

GTCAGGTAGA GAAGGAGATG TTGCAGGTGG GGGAAAAGAG ACTTAATGAA TTACTTGAAA 4740 

AATAAACAAT TGATATTTTT AGGAGAATAG AAATGAGAGG GTTTAATAAC AAGATAAAGT 4800 

CTGTTTATCA AGAACTAACA AATTCCAAAG AGAAATTCGG TAGCTTTCAC AAGACTTTAA 4860 

TTCATTTGCA TACACCTGTT TCTTATGATT ACAAGCTATT TTCTAATTGG ACTGCAACGA 4920 

AATATAGAAA AATTACTGAA GATGAACTAT ATGATATATT TTTTGAAAAT AAGAAAATAA 4980 

AAGTTGATAA GACAATTTTT TTTAGTAATT TTGATAAGGT TGTTTTTTCT AGTTCAAAAG 5040 

AATATATTAG TTTTCTTATG TTAGCAGAGG CAATCATAAA AAATGGAATA GAAATAGTTG 5100 

TAGTAACTGA TCATAATACT ACCAAAGGTA TTAAAAAGTT ACAAATGGCA GTCTCAATCA 5160 

TAATGAAAAA TTATCCGATT TATGATATAC ATCCTCATAT TTTACATGGA GTAGAAATTA 5220 

GTGCAGCAGA TAAATTGCAT ATTGTATGTA TATATGATTA TGAACAAGAA TCATGGGTTA 5280 

ATCAATGGTT AAGTGAAAAT ATTATAAGTG AGAAAGATGG AAGTTATCAA CATTCACTGA 5340 

CTATAATGAA GGATTTCAAT AATCAAAAAA TAGTTAACTA TATTGCTCAT TTCAATAGTT 5400 

ATGACATTTT GAAAAAAGGT TCTCACTTAT CAGGTGCATA TAAACGAAAA ATTTTTTCTA 5460 

AAGAAAATAC ACGATTTTGG AGTTTAATAT TAACTCGAAA GAATCTTCGC AACAACTTGA 5520 

TATTCTCTAT AAAGAAGTTG GTGTATTAAG TTTGGGACAA AAAGTTGTAG CCATGCTTGA 5580 

TTTTTTATTA GCATATAGTG ATTATTCTAA AGACTTCAGA CCATTGATTA TTGATCAGCC 5640 

TGAAGACAAT CTAGACAATC GTTATATTTA CAGGCATTTA GTTCAGCAGT TTAGAGATGT 5700 

GAAAGCTCAA CGTCAAATTA TTTTAGCAAC ACATAATGCT ACAATTGTAA CAAATTCTAT 5760 

GACAGATCAA GTTGTTATTA TGGAGTCAGA TGGAGTTAAC GGATGGATTG AATCACAGGG 5820 

ATATGTTAGT GAAAAATATA TAAAAAATCA TATCATCAAT CAATTAGAGG GAGGAAAAGA 5880 

TTCCTTCAAG CATAAAATGT CTATATATGA GACGGCTTTA TCAGAGTAGA GTCAGAAAAA 5940 

GTAGGTTAGA AATTTAGCCT ACTTTTTTCT TTGTCCGACA GGCATAGTGT ACATCTGAGG 6000 

TCCAAGTCCT CTGTGGATAT TTGCTGCAGA TGAAACCAAT AGCGACTCCT AAGCCTGAAT 6060 

ATCGTGAGGT AGGGGGGATA GGAAGGAATT AGCGAAATCA AGGTTCTACA AACAGAATCG 6120 

TGACTTGAAG CCATATATAG CGGATGAGGA ACTCTAAAAT CCAAATAGGT GTCGTAACCT 6180 

ATATACGTAA ATTACGAGAG TAAACTAGGA AAGATGTACG GCTTATTCCG TGAGCGTTTA 6240 

GGACGTAGTA CAACGAATCA TGGGAGTCAG CTGAACACAT AGTATTGAAG AAATTTCTGT 6300 

AATGGAAATG GAGCGAAGAA GTGAACAATT AAATGAATAC CTCTCTAATT AAATTTGTCA 6360 

ATTCTAATTC CTGGTATGAA AAGACAGTGA CCTGAAAATG TAAACGATGG GAGCTGATCA 6420 

TAAATATAGG ACGGTACATG CAGTGGTGTT AGAGATTAGT CCTTACTTGA TTTGTGATAA 6480 

CTTCCCCAAA TTTCTTCTGC TATACTTTTC TCAACTTTTA AAAATCCAAC TAAGAATTTT 6540 

ACCTGGGGGT TTGGGGGCGG AGCACTAAGT TATCTTATCG TTAGCTGTCA AAACTGGTAG 6600 

GTTTTGATAG GCTGGCGATA TGATTTTTGG GATATTGTGG ACACAATATC TGAGCTCGCA 6660 

AAGCCTTACA AGAATGAAAA TCAGTTGTTG GAAAAGTGTA CTGACATTGT ATGGTAGCTC 6720 

ACATTGTCAG TACAAGTATT TTGGAAAGGA AGTAGCAGTA TGAAACGAGA TGTGCGTGAT 6780 

ATTCGGAAAC AATTTCGTTT AACAGAAGCA GAAGAAAAGC AAATTCTAGC TTTGATGAGA 6840 

GAGCGGGGAG AGACTAATTT CTCTGATTTT CTTCGTAAAA GTTTACTTTC CTCTGATTTA 6900 

CAAAAACAGA TGGAGACATG GTTTGCCCTC TGGCAATCCC AAAAACTAGA ACAAATCAGT 6960 

CGTGACGTTC ATGAAGTTTT AATCTTGGCA CAGTCAGAAC GTCAAGTCAC CCAAGAGCAT 7020 

GTATCTATTC TCTTAACGTG CGTGCAGGAA TTGATTCAAG AGGTTGCAAA CACCATACCC 7080 

CTCAGTAAAG AATTTCGTGA GAAGTACATG AGGTAAGCAC ATGGAACATC GTTACCGAAC 7140 

CAATCTCAAG AAAGTGTTTT TGTCTGATAG TGAGTTGAAC CAACTAAATA TAAATATCGA 7200 

TCAAAGTGGT TGTAAATCCT TTTCTGAATA TGCGAGACGA ACTCTACTCG ATCCTGGTAT 7260 

GAATTTTATC ACGATTGACA CAAACGGTTA CCAAGATTTA GTGTTTGAGT TAAAGAGGAT 7320 

TGGCAATAAT ATCAACCAGA TTGCTCGAAG TGTTAATCAA TCTCAGTTAA TTTCTGGTGA 7380 

AGAATTGCAG GAGTTGAAAA AAGGAATTGG TGAATTGATA AAAGAAGTTG ATAAGGAATT 7440 
«A««au GCGCACAAGC TAAAGGAGTT CCATGGTCAT CACTAAACAC ^CCA^C 
ACGGAAAGAG .PACCGCAGA AACC^ AGTACAT7CT CAATCCTGAG AAAACCAATA 
ATCTTGCCTT GG TCTCGGAC TATGGCATGA AGAATTTTCT GGAC^c 
™**«* GATGTATCAT GAAAATTTCA TCAGCAACGA TACGC^AC GATTTTCGCC 
ACGACAGGAT GGAAGAAAAP CAACGAAAAA TACACGCTCA CCACA^ CAGTCTTTCT 
• CGCCACAGGA 

AATTAACTGG TGGCAAATTT CGTTTTATCG TTGCGACCCA 

ACAATCACAT CATTATCAAT TCAGTAGATA GCAATTCTGA CAAAAAGCTC AAGTGGGACT 
ACAAGGTGGA GCGAAATCTT CCCA^ CTCACCG^ TTCTAAAATC GCAGGTGCTA 
AAATCATTGA GAACCGCTAT ^CACCAGC GGTATGAAGT CTATCGTAAG ACTAATCACA 
AGTATGAACT CAAGCAGCGA CTCTATTTTT TTCTAGGGAC TTTGAGGATT 

TCAAAAAGAA TGCTCCGCTA CTACATGTGG AGATGGATTT CCGTCACAAG CA^CCACC, 
TTTTTATTAC GGACTCAACT ATGAAACAGG TGGTGCGTGG CAAGCAACTC AATCGCAAGC 
ACCC^ACAC AGAAGAATTT TTTAAGAACT ACTTTGCCAA AAGAGAAATA GAAAGTCTCA 
-CAA™ ATO5CTCAAA GTO3aqaata ^ ACTTCAGAAA GCAAAACTTT 
TTGGACTAAC TATCAATCCT AAACAAAAGC ATGTTrCTTT TCAATTTGCA GGAGTGGAGG 
TAAAGGAGAC AGAGCTAGAC CAGAAAAATC ^CA^ AGAGTTTTTC CAAGATTATT 
TTAAAAATAG AAAAGATTGG CAAGCTCCAG AAACTGAGGA TTTCGTTCAA CTTTATCAAG 
AAGAAAAGTT ATCCAAAGAA AAAGAACTTC CAAGCGATGA GAAGTTCTGG GAGTCCTATC 
AAGAGTTCAA GAGTAACAGA GA TO CC GTO: A.GAA^CA GGTGGAGTTG TCACTCAATC 
AAATTGAAAA AGTAGTGGAT GATGGAATTT ACGTCAAGGT CAAGTTTGGT ATTCGTCAGG 
AGGGACTTAT C^CCG AACA^CAGC AGAGGA.AAG GTGAAGGITT 

TCATCAGGGA AACCACC T CC XACA^ ACCACAAAGA CGCTGCCGAG AAAAATTGTT 
ATATGAAAGG TCGAACCTTA ATTAGACAGT TCAGCTATGA AAATCAAACC ATTCCATTAC 
CCAGAAAAGC GACAGTCGAT ATGATTAAAG AGAAGATTGC GGAAGTGGAT GC^G 
AACTGGAAGT AGAAAATCAA TC^^A CGATTAAAGA ^AG^ CATGAACTAG 
«««« ATTGAGAATC AA^C AAGAACGAAT GTCAACCTTG AATCAAGTAG 
ACTCGCTTCA GTTGAAAGTA AGCAAGAAAT GAAATTAAAT C^CAAAAC 
TGAATATAAC TGAGAATATC AGTGCTAATA 



7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
ATCAACTGGA ATTGGAAAGG GGCAGGTATG AAAAGATGGT AGT 9223 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6827 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

TCTGCTGGCT ACCATCATCT GACTTGGGCA AGACCAAAGT CTTAGTTACA ACTGTATTCT 60 

TCTCAGCATT TTCAATAACT GGCAATGCCG ACTGAAGCGT ATCTTTTTCT GTTTTTGTAG 120 

CTGGTCCAGT TTCTTTTTTC TGTCCGCAAC CAACCAGGAC AAAAAGGAAA GCTAGACTAA 180 

CAAGAACTAT TTTTTTCATT TCTTTCTTCT TTCTTTTTGA AATTAAAATA GAATAAGACT 240 

GGGAAGTGCT CCCAGCCTTG ATGTTTATAG AGCTGCACGC AAACGTGCTT CTGCATTTTC 300 

TACATTACGG ACAGAGCGTG GTAGGAAGGC ACGAATATCG TCTTCCTTGT AGCCAACTTG 360 

CAGGCGTTTT TCATCTACAA GGATTGGGCT CTTTAAAATT CTCGGTGTTT CCATAATCAG 420 

ATTGAGAACT TCATTGACAC TCAAATCTTC AATATCCACT CCAAGGGCTT TGGCATAGCG 480 

ATTTTTAGAC GAAACGATGC TGGCTATTCC GTTATCTGTT TTGGTTAGAA TATCCAGTAA 540 

TTCTTCTCTC GTAATTCCTT CTTTACCAAG GTTTTGTTCT TTATAACTTA ACTGGTGGGC 600 

ATTGAGCCAG GTTTTTGCTT TTTTACAGCT AGTACAACTT GAGACTGTAT AAATTTTAAT 660 

CATGTACCTA CCCCTTTCGC TACATGTTAC TATCAGTTTA GTCTATTATA CCATAAAAAA 720 

CATCCGACTT GCGACCTATT TTTAATTTTT TTTGACTTTT TTCGTCATTT TCGTACTTTT 780 

TTCTTGACAA ACAACTAAAT GACTATCAAC TCTTTTGGAG CTAGGGTCAA TAATTCACAA 840 

CCTGTCTCTG TAATCAGGAT ATCATCCTCG ATACGAACGC CATATTTGCC TTCGATATAG 900 

ATACCTGGTT CATCGGTCAA GGCCATACCT GTCTTAATAG TTTCTGTAGA AGTCTGACTA 960 

AAGTAGGGTT CCTCATGGAT ATCCAGACCA ATACCGTGGC CAATGCCGTG AGTAAAGTAG 1020 

TCACCATAAC CTGCCTCAAT GATAATATCA CGAGGGATTT TGTCAAAGTC ACGGAAACCT 1080 

AAGCCTGCCT TAGCTTGGTC AATCAAGGCT TGGTTAGCTT TTAGAACCGT ATTGTAAATC 1140 

TCTGCCTGCT CATCGCTAAC ATGCCCTAGA TAGATAGTCC GGGTCATATC ACTGACATAG 1200 

TGGTCATAGA GACAGCCGAA GTCCATGGTG ATGGCTTCTC CCAACTCCAC TGGTTTGTGC 1260 

ATTGGATGGG CATGGGGTTT AGAAGAATTG ATACCGCTAG CTAGGATCGT ATCAAAAGAT 1320 

AAGCCAGATG CTCCCAACTC ACGCATGCGG AAATCAAGGA AGTTGGCAAT CTCAATTTCA 1380 
GTTTTTCCTG GTTTGATAAA GTCAAGCGCA TCGCGGAAAG CTTGGTCTGA GATAGAACAA 1440

GCCTTGCGAA TCGCTGCAAT CTCTGCCTCA TCCTTAATCA TACGAAGACC TTCCACAAAC 1500

TGAGTTTGTG GAAGCAAGTT CAAACCTGCA AAAGCTGCCT GCATACGGTG GTAATAAGAC 1560

ACTGAAATCT CATCTTCAAA ACCGATACGA GTCAAGCGCA TGTCCTTAAC AATTCCTGCA 1620

ATGACAGCCA ATTCATCACG ATCAGCCACA ATCTCAAAAC CACTGGTTTC TTGCTTAGCT 1680

GCGATGATAT AGCGAGAGTC TGTCACTAAG ACCTGACGGT CACGACTGAT AAAGACTGTT 1740

CCGTTTGAGC CCCAAAAACC AGTCAAATAA TAGACGTTTT TAAGATTGTT GATGATGATA 1800

CCATCTAGTT CTTTTTCTTG CATTTTAGCT AGAAATGCTT GTACGCGTTT ATTCATGATG 1860

TAACTTTCCT TTCAAATAGT GTCCTGTATA GCTGGCTTCG TTGGCAGCTA CTTCTTCTGG 1920

AGTTCCTGTT ACGATGATGG TTCCACCACC GACACCGCCC TCAGGTCCCA AGTCAATGAT 1980

ATGGTCTGCC GTCTTGATAA CATCCAGATT GTGCTCGATG ACGAGGACTG TATTGCCATC 2040

GTCTACAAAG CGAGCTAAAA CCTTGAGCAG GCGAGCAATG TCCTCTGTAT GAAGCCCTGT 2100

OGTCGGCTCA TCCAGAATGT AGAAAGATTT TCCTGTCGAT CGTTTGTGGA GTTCGCTAGC 2160

TAACTTCATA CGTTGGGCTT CTCCCCCAGA AAGGGTGGTA GCTGGCTGTC CCAAGGTCAC 2220

ATAGCCTAGC CCTACATCCT TGATGGTCTG GAGTTTGCGT TGAATTTTCG GAATGTGTTG 2280

GAAAAATTCT ACCGCATCGT TGACCGTCAT ATCCAAGACC TGCGAAATAT TCTTTTCCTT 2340

GTAGTGAACT TCTAGGGTTT CACTGTTATA GCGGGTTCCG TGGCAAACTT CACAAGCCAC 2400

ATAAACATCT GGCAAGAAGT GCATCTCAAT CTTGATAATC CCGTCACCTG AGCAAGCTTC 2460

ACAGCGACCT CCCTTGACGT TGAAACTGAA GCGCCCCTTC TTGTAGCCTC GAATCTTGGC 2520

TTCATTTGTC TGAGCAAAAA GGTCACGTAT ATCGTCAAAA ACTCCTGTAT AGGTAGCTGG 2580

GTTAGACCTC GGCGTCCGTC CGATAGGGCT CTGGTCAATA TCAATCAAAC GGTCGACATG 2640

CTCAATCCCT GTAATAGTCT TAAACTTACC AGGTTTGTCT GAATTACGGT TGAGCTTCTG 2700

GGCAATGGCT TTTTTGAGAA TGCTGTTGAT TAGAGTCGAT TTCCCTGAAC CCGACACACC 2760

TGTCACTGCG ATAAATTTTC CTAGTGGAAA GCGAGCCGTG ACATTTTGCA AGTTGTTCTC 2820

ACGCGCTCCT ATCACTTCAA TAAAACGACC ATTTCCGACA CGGCGCTCTT CTGGTACTGG 2880

GATGACACGT TTGCCTGACA AGTACTGACC TGTGATAGAC TTGCTGTTGC GAGCCACTTG 2940

CTTAGGTGTA CCTGCTGCAA CAATCTCACC ACCAAAAACA CCGGCACCAG GACCAACGTC 3000

AATCAGATAA TCAGCCTCAC GCATGGTATC TTCGTCGTGT TCCACCACGA TAAGAGTATT 3060

GCCCAAGTCA CGCATCTTTT TCAGACTGGC AATCAGGCGA TCATTGTCCC TCTGGTGAAG 3120

ACCGATTGAC GGCTCGTCTA GGATATAGAG GACACCTGAT AGGTTGGAAC CAATCTGGGT 3180

TGCCAAACGA ATGCGCTGAC TTTCCCCACC TGAAAGGGTT CCTGCTGAAC GTGACAGGGT 3240

TAGATAGTTA AGACCCACAT TATTAAGGAA GGTCAAACGA TCCTTGATTT CCTTGAGAAT 3300

GGGACGAGCA ATGATGGCTT CATTTTCAGA CAAAGTTAAC TGGCTCACCA AGTCCAAGTG 3360

GTCAGCGATA GACAGGTCTG AGATTTCTCC AATATGTGGC CCTTGCTGGC CGCCCACACG 3420

GACAGACAAG GCCTGGTCAT TGAGACGATA GCCTTGACAG GTTCCGCAGG TCAGCTCATT 3480

CATGTAGAGA CGCATCTGAG TGCGAGTGTA ATCGCTATTG GTTTCATGGT AACGACGTTT 3540

GATATTATTG ATAACTCCCT CAAACGGAAT GTCGATATCG CGCACGCCAC CAAATTCATT 3600

CTCATAGTGG AAATGGAATT CCTTACCATC TGACCCATAG AGAATCAAGT TCTTATCTTC 3660

TTCTGACAGG TCCTCAAAAG GCTTATCCAT AGCCACTCCA AAGACTTTCA TGGCCTGCTC 3720

TAACATGTTT GGATAGTAGT TGGATGAGAT AGGATTCCAA GGTGCTAGCG CTCCCTCACG 3780

TAAGGTTTTG CTAGCATCTG GCACTACCAA ATCAGTATCC ACCTCCACCT TGATGCCCAA 3840

GCCGTCACAC TCACTACAAG AGCCAAAAGG AGCATTGAAA GAAAAGAGAC GAGGCTCTAA 3900

CTCTGGGACA GTAAAACCAC AAACTGGACA GGCATAATGC TCAGAGAACA ACAACTCCGA 3960

GTCGTCCATG GTGTCGATAA TGACATAACC TTCTGCAATA CGAAGGGCAG CCTCAATGGA 4020

ATCAAAGAGA CGACTACGAA TGCCCTCCTT GATAACAATA CGGTCAACCA CGACATCGAT 4080

ATTGTGTTGC TTGCTCTTAG ACAACTCTGG CACTTCGGTC ACATCATAGA CTTCCCCATC 4140

CACACGGACA CGAACATACC CGTCTTTCTG AACCTTCTCG ATAACACTCT TATGTTGGCC 4200

TTTTTTCTTG CGGATGACAG GAGCCAAGAT CTGCAAGCGC TGGCGTTCAG GTAACTCCAA 4260

AACCTTATCA ACGATTTGCT CCACAGAAGA AGCATTGATA GCTCCATGTC CGTTGATACA 4320

GTAAGGCGTC CCCACACGTG CGTAGAGGAG ACGCAGATAG TCATTGATTT CAGTCGTCGT 4380

TCCCACCGTC GAGCGAGGAT TTTTACTAGT CGTTTTCTGG TCGATGGAAA TAGCTGGGCT 4440

GAGACCATCA ATGGCATCTA CATCTGGTTT TTCCATATTT CCCAAGAACT GACGAGCGTA 4500

GGCGGACAAA CTCTCTACAT AGCGACGTTG TCCCTCCGCA TAGAGAGTAT CAAAAGCCAG 4560

ACTGGACTTC CCTGAACCTG ACAAGCCAGT CACGACAACC AACTTGTCTC GCGGAATCTC 4620

CACATCAATA TTTTTTAAAT TATGGGCACG CGCCCCATGA ATGACAATTT TATCTTGCAT 4680

CTTTGTTCTT TCTAGTCCAT TATTGCTTAC CATTATACCA AAAAAAGTGA GATTCTATTA 4740

CCCAAAAGGC CGATTTTGTA GTATAATAGT ACAGTGTGAA AAAATCTGAA AAATGAGAAA 4800

GGATAAGGGA TATGAAACAA GTTTTTCTCT CTACAACAAC TGAATTTAAA GAGATCGATA 4860

CGCTTGAACC GGGTACTTGG ATCAATCTCG TCAATCCGAC TCAAAATGAA TCACTCGAAA 4920

TCGCCAACAC CTTCGATATT GATATTGCTG ACCTTCGAGC ACCGCTCGAT GCGGAAGAAA 4980

TGTCTCGTAT TACCATTGAA GACGAGTATA CCCTGATTAT CGTAGACGTG CCGGTCACGG 5040

AGGAAAGAAA TAACCGCACC TACTACGTAA CCATCCCGCT TGGTATTATC ATCACTGAGG 5100

AAACCATTAT CACTACGTGT TTGGAACCAC TACCTGTCCT TGATGTCTTT ATCAACCGTC 5160

GATTGCGTAA TTTCTATACC TTCATGCGTT CACGTTTTAT CTTTCAAATT CTTTATCGCA 5220

ATGCAGAGCT TTACCTAACA GCCCTTOGTT CAATCGACCG CAAGAGTGAA CAAATCGAAA 5280

GTCAACTGCA TCAATCAACT CGTAATGAAG AATTGATTGA GCTCATGGAA TTGGAAAAAA 5340

CTATCGTCTA TTTCAAGGCC TCCCTCAAAA CAAATGAGCG CGTGATTAAG AAATTGACCA 5400

GTTCAACCAG CAATATCAAG AAATACCTTG AGGACGAAGA CCTGCTTGAA GACACCCTGA 5460

TTGAAACCCA ACAGGCCATC GAGATGGCAG ATATTTATGG AAACGTCTTG CATTCTATGA 5520

CAGAGACCTT TGCCTCTATC ATTTCTAACA ACCAGAACAA CATCATGAAA ACCTTGGCCC 5580

TTGTGACCAT CGTCATGTCC ATCCCAACCA TGGTCTTTTC TGCCTACGGG ATGAACTTTA 5640

AGGATAATGA AATCCCCCTA AACGGAGAGC CAAATGCCTT CTGGTTAATC GTCTTTATCG 5700

CCTTTGCTAT GAGTGTCTCG CTCACTCTCT ATCTCATCCA TAAAAAATGG TTCTAAGAGG 5760

AGTTCCTATG TCTCAAATTG ATCTACAAAA ATTAACTAAG AAAAACCAAG AGTTTGTCCA 5820

CATTGCTACC CAACAATTCA TCAAAGATGG GAAAACAGAC GCTGAAATCC AGACTATTTT 5880

TGAGGAAGTC ATTCCCCAAA TCCTTGAGGA GCAATCTAAA GGTACAACTG CCCGTTCCCT 5940

ATACGGCGCA CCAACTCATT GGGCTCATAG CTTCACTGTC AAAGAGCAGT ACGAAAAAGA 6000

GCATCCAAAA GAAAATGATG ACCCAAAACT GATGATTATG GACTCAGCTC TTTTCATCAC 6060

TAGCCTCTTT GCCCTTGTCA GCGCCCTCAC AACCTTCTTT GCGGCAGACC AAGCTTTCGG 6120

CTATGGATTG ATTACTCTTC TATTAGTTGG ACTGGTTGGT GGATTTGCCT TCTACTTGAT 6180

GTACTACTTT GTTTACCAAT ACTATGGACC AGATATGGAT CGCAGTCAAC GTCCACCTTT 6240

CTGGAAATCT GTACTAGTTA TCCTAGCTTC TATGTTCCTT TGGTTGCTTG TCTTCTTTGC 6300

AACAAGCTTC CTACCAGCTA GCCTTAACCC AGTACTGGAT CCATTGCCAC TAGCTATTAT 6360

TGGAGCAGCC CTCCTAGCCC TTCGCTTCTA TCTCAAGAAA CGCTTGAATA TCCGTAGTGC 6420

AAGTGCAGGA CCAACACGCT ATCAAGAATA AGAAAACGAT AAAAGCAACT GCAGGTGCGG 6480

TTGCTTTTTC ACTTACTTTT TTGAGTTATA TTCAATGAAA ATCAAAGAGC AAACTAGGAA 6540

GCTAGCTGCA GGTTGCTCAA AGCACAGCTT TGAGGTTGCA GATAAAACTG ACGTGGTTTG 6600

AAGAGATTTT CGAAGAGTAT TAAAAGTATT CTTCTGAAAT CCCACATAGC TTTCTCTTAT 6660

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

— „ GCATAGAGCA AAGTTGCTTC TTCATCAACA AAACCGTTCA TTTCAAAATA 
~C TCATCAGGAC _ G ^ 

~G — C CAGGCCCTGT CTCTGATGGT C^ 

TTCTTCTAAA ATATGGGAAA ^ «« 

™A " ™ ~ A T ~ 
AACATf ACTTGCTTGC CTGAATGATC TTAATTGGAA TTTCCATGGG 

— - ~ ~ ~ ~ — - 

====== 

GACAATTTTC TCAGTVa OGCAAAGTCA TCGATAATCA CTGTATCATT 

k-haitttC TCAGTGAAAC GACGTTTAAr «wr. 

c , ar ^ 0 CGTTTAAC ACCGGCAAAT GTTTTCAAGT GCTCACGCAC 

====== 

====== 

====== 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
GTATTCTGGG TGGTAAGGCA TGAAGTGACG CTCATATTCG TCAGATTCAA AGACAAAATA 1320

TTTGGCATTG GCCGAACCAC GACCTGTCCC ATCTCCAATC AAGAAGCTGG TATCTGTAAT 1380 

GTGAGACAAG ACATGAGACA ACATACCTGT CGTTGAAGTT TTTCCATGTG CTCCTGCTAC 1440 

TCCCATGCTA ACAAAGTCAC GCATAAAGCT ACCTAGAAAC TCATGGTAAC GTTTGTAGCT 1500 

GATACCATTT TGGTCCGCAT AGGCAATTTC GACGTTGTTA TCTGGACGAA AGGCATTTCC 1560 

AGCGATAATT TCCATATCAC CGTCTAGATT TTTTTCATCA AAAGGAAGAA TGGTAATTCC 1620 

TGCCTGCTCA AGACCGCGTT GGGTAAAGTA GTACTTTTCA ACATCTGATC CCTGAACCTT 1680 

GTGCCCCATC TGGTGCAACA TCAAGGCCAA GGCACTCATC CCTGATCCCT TAATTCCGAT 1740 

AAAATGATAT GTCTTTGACA TGTTTTCTCC CCTATTCTGT CATTCTGGTC AGATTCAACT 1800 

CTTGGGCAAC CCGACGTTCT TGTTCTGTTT GTTTACTTTT TTTATTGTAG ATTTGGCTCT 1860 

TCTTTAGAAA ATCATAATTG TTTTTCTTTG GAGCAGGTGC TGACACTTCT TCATTCTTGG 1920 

TAGGGATAGA ATGAACTTCT TCCGCCAAGA TATAATGAGA CTGGGTCAAT TTTTGGCTAT 1980 

ATTTGACAAA TTCACCAGGA TTTTCCTTTT GGAAAGGAGC TGTCGGTTGA TTGCCCTGTC 2040 

TAACTAGACT GGGCTGAGAA TGACGTCTCG CAAGGCTGAA ATCCTGAGTT AGGTAGTTAG 2100 

CAGAGCGTTT CTTTTTCAAG TCCGCACGCG CTTCTTCACG CGCCACCTCC GCATAGCTCT 2160 

TTCCTTCTTT TTTAACCCCT AAAGGAGCCT TTTTAGGTTT TTCGACTTGC TTTTCAATCG 2220 

GTTTTACTGG TTTTTCTTCA GCAATAGGAG CCCATTCTAA ATAATTTTTA TCTCGATACT 2280 

CACCCTTGAT ATTACTGATC AGATCAGACT CATCATAGAG ATTCATGACT GGCATTTCAG 2340 

TCAACATGAC CTCGTCATCT GACACCAATG GAAATCGTTC TTGTTTCATT TTCTATTTCC 2400 

TTTCAACACT TCATTATAGC GTATTGTCTT GATTTTTCAA GTGCTGGCTT CAGAAATTCC 2460 

CAAAATTTCT CTAATTTCTG CTAGGGTCAG ACTACCACGT GACTCTGTGC CGTCCAATAC 2520 

TTGTGACACC AGATGTTTCT TTTGTTCTTG GAGTTCCTGA ATTTTTTCTT CAATGGTTCC 2580 

CTTGGTCACC AAGCGATAGA CCTCAACCGT TTCTTCCTGA CCCATCCGAT GGGCACGGCC 2640 

AATGGCTTGC GCTTCCACCG CAGGATTCCA CCAAAGGTCA ACCAAGATCA CTGTATCTGC 2700 

ACCTGTCAGG TTCAGACCGA CCCCACCAGC CTTGAGGGAA ATCAGAAAGG CATCTCTTTC 2760 

TCCTTGGTTA AAGGCCTTGG TCATGTCTTG TCTTTCCTTG GCTGGGGTTG AACCCGTAAT 2820 

TTTAAAGGAA GTCAGGCCCA AGTCTGGCAG TTCTTGTTCA ATTTTTTCCA ACATTCCCTT 2880 

GAACTGAGAG AAAATCAAGA CACGGTGTCC GCCGTCTGCC ACCTGTACCA GTAGGTCTCG 2940 

GAGACTATCT AGTTTGCCGC TGGCTCCCTG ATAATCTTCC ATAAACAGGG CAGGAGTGTC 3000 
ACATATTTGA CGCAAGCGCA TCAAACCAGA TAAAATTTCC ACACGACTTC GCTGAAATTC 3060

CTGTTCTGAC ACTTGAGCCA GATGGTCTCG CATCTGTTGT AACTGGGCAA GGTAAATAGC 3120

CTTTTGCTGG TCTTCCAGTT CATTTTTATA AACCACCTCA ATCAAGTCTG GCAATTCAGT 3180

CAGAACTTCT TCTTTCTTGC GTCGCATCAC GAAAGGCTTG ATAAACTGAG CCACTCGCTC 3240

TGCTGGCAAT TTCATAAATT CTTTCTTGCT TGGCAAAAGT CCAGGCATGA CGATTTGGAA 3300

AATAGACCAC AACTCACCCA GATGGTTTTC AATCGGAGTT CCTGACAAGG CAAAGACCGA 3360

CGGCACCACA AATTGTCTCA AGGTCTGGGC AATCTTGGTC TGGGCATTTT TCATGACCTG 3420

AGCCTCATCT AAGAAAAGGA AGTCAAAGGC CATCCCTTGA TAAAACTCAC TGTCCTGACG 3480

GAAGGTGGCA TAGCTAGTCA CATAGATTTG ATGGCTCTCG GCAAGAATCT CCTCACGACT 3540

TGCTTTCAAA CCATGAACAA CAGTCACATC CAACTGTGGA GCAAATTTCT GAAACTCATC 3600

TGCCCAGTTG TAAATCAAAC CCGACGGAGC GAGAATCAAA ACCCGACTTT CTTTTGTCAC 3660

TTGACTAGTC AAAAAAGCAA TGGTCTGAAG GGTTTTCCCA AGTCCCATAT CATCAGCCAA 3720

AATCCCACCA AAACCATAAT GATGGAGCAT CTGCAACCAG CCAATTCCCT TTTCCTGATA 3780

ATCTCGCAAG TCAGCCTTGA CCTGAGTTGC TTGCAAAGGA AAGTCCTCTG GATGCGTCAA 3840

ATCCTGGGCC AGATTCTGGA ATTCTTGTGA AAAAGAAACA CGGTCTCGCC CTTCAAAGAG 3900

ATGAGCTAAA CTGTAGGCCA AGGATTTCCG AGCCTGCAAG GTCCCATCTT TTAATTCAAA 3960

TTGCCCCAGT TCCTGTAGAT TTTGGCGAAT TTTCTTGGTT TCTTCATCGA AAAAGTAAAC 4020

TTGATTAGAC GAATCAATAT AAAAATCCTG ATTGGCAACC AAGGCCTGCA TGGCTTGGTC 4080

GATTTCCTCC TGGACAATAT TTTGAAAATC AAACTGGATT TCCAAGAGAC CTCCCTTGGA 4140

GGCAATCTGC ACCTGAGGAC TCGCTAGGCT ATAAAGCTCT TCTAGTTTAT CTGATAGGTC 4200

AACATGCCCG AGTTTTTCAA AGACTGGAAT GATATCATGA AAAAAATGAT AGACAGACTC 4260

CGCTTTTAAG GCCTGACGCC AAGATTGAAA ATCGGCCTCA AAGCCCGCAG CCAAACAGAC 4320

TTGGAAAATT CTTTCTTCTA AGTCTGCGTC ACTTGAAAAG GGTAATTCTT CTAGCTCTTG 4380

TCGGCTAGAT ACCTGTCTAT TTCCATAATC AAACTGAATT TCTAAACGAA TCCGATTATC 4440

TTCTTCCCTG TCAAAGTAAA AAGAGGGCGC AAAAGTTTTG ATTTGTAGAC GTTCTGGAGC 4500

TGAAACGGTG CCCATCTGGA TAAAAAGAGT CAGACAGGAG GCCAATTTGT CTCGATCACT 4560

GCTATCAAAT TGCAGGTATT TCTTTCCTTG TTGACCCACA GGTAACGCTT TAATTTCCTT 4620

GAGAAGACGC ATCTGCTGGT CTGTTAAAAA ATAAACCTGA CCTTTATGGA AAAGTACTGC 4680

TCCCTGATAA AAGACATTGA CCCTAGGACT CTCACTGATT TCCATTTCAA AATAATCCGA 4740

GTATTCTGTT ACTGTAAAGG CAAATAGATT GGCATCAGCA TGCATATCCT GAAAAAGCAG 4800

GGTTTGGTAG CTATCCACTT GATGGTCAAA TTGAAAATGG GGCAAGGCCA TCAGTAAATT 4860

CACACCCTGC TCAAAAAAGG TCAGAGGGAA AAAGAGGTGC CGACCTTGGT TTTGGAAAAA 4920 

GAGGTCTGGA ACCAGCCCTT CCTCCGTTAG TCCGTGCAAG AAAGTCAAAA GTTCTTGGCT 4980 

GGCATCATCA AAGGCTTCCC AAGAAAGAGA CTCCTCATAA ATCTTGCCAA TCATATACGA 5040 

CTTTCTCTGC TCGACAATCC TTAAAAAAAG TGGAATATCC CGAATGACAT AGTATTTTTG 5100 

GCTATTGATT TGGCCGATTC TCAGAGTCCA CAAGATATGA TTGGTTCCTG CTTCCACCTG 5160 

ACCCACAGCT GATAACTCAT AGGCGCATTC TGATTTTGGA GATAAAATTC GATCCAAAAA 5220 

CTTGCCACCC AAGGTCACCT TGGTTTCAAC AGCCTCTTTT TCTTCATGAC CTTCTTCCAG 5280 

ACTCCACAAG ATTTCCTGAC CACGCTCATC ATTTTTCAGA AAATGCTCTA GCGCTGCCAA 5340 

ATGCACACAG TAGCCCCTCT TTTGAAAAAA ATCACAGGCA CAAAAAACCA AATCATCCTC 5400 

TAAACTATAG CGCAGTTCTT CTTCTGCAAC GCGAGCGTAG AGCCGATTGT TCTTTTCCTT 5460 

GATGATATCA ACCTTACCAG TTTCATAAAG GGCAACACCT TCGATACGAA TTTTCCCCGG 5520 

AATCAATTTA GCCATATTTT CACCTTTACC TTATCTTTTT ATTATACCAT ATTTTCGCCT 5580 

ATGAAAATAG CCTTCTAGGA AGACTTTTCT CCTAGAAGGC TGGATTTTTA ACGTTTGGCA 5640 

AAAGTAGCCA CAATCCGCTG ACAGACTTCT TGCAACAGAG ATTTGGGCAT AGCTATATTG 5700 

ATGCGGGCAT GGAGACTTCC TTCCTCTCCA AAATCCAAAC CACGGTTGAG GATAACCTTG 5760 

GCTTCATTTC TCAACAACTC TTGCAATGTT TCATCAGTCA GGTCATAAGC TGAAAAGTCA 5820 

AGCCAAATCA AGTAGGTACC TTGCGGTTTC ATGACCTTGA TTTTAGTCTC TTTTCCAAAT 5880

AGATCCATCA CATAATTGAT GTGGTCTTCA AAGACTTGCT TGAGTTCCTC TAGCCAATCT 5940 

TTACCGTATC GATAGGCAGC TTCTGTCGCC AAATAACCCA AGCCTGAAAT TTCATGCTGA 6000 

TTATTGGCCA ACAGGCGTTT CTGGAAAGCC AGTCTCAACT TAGGATTTTC AATGACTGCA 6060 

TAGGAATTTT TTGTTCCAGC AATATTAAAT GTTTTAGTGG CACTGCTCAA GACGATAGCA 6120 

AAATTTTTGA AGGCAGGATT GATGGTATTG AAAGACTGGT GTTTGTGACC AAAGAGGGTC 6180 

AAATCTTGGT GAATCTCATC CGAAACTAAC AAAACACCGT GTTTTTGGCA GAGTTGGCCA 6240 

ATCTTCTCCA ACACTTCTTT TTCCCAAACA CGTCCACCAG GATTGTGAGG GTTGCAAAGA 6300 

ACATAGAGTT TAACCTCCTC TTCCACCAAA TCCTTTTCAA GTTGGTCAAA GTCAATCTCA 6360 

AACAGACTAT CCTTTTCCAC TAAGGAATTA GTAATCAATC TACGATTATT CAACTTGACA 6420 

CTGCGAGCAA AGGGTGGGTA GACAGGCGTG TTAATTAAAA CCGCCTCGCC TTCTTTTGTA 6480 

AAGGTTTGAA TAGCTGTTGA GATGGCTGGT ACCACACCCT CGATAAAGAC AAGAGCCTCT 6540 
TTGTCAAAGT


TGTAACCGTA 


TTGTGTAGCT 


TCCCACTTTT 


GAACTTCCTT 


AATTAAGTPT 


6600 


TCACTGGCAT 


AGGTATAACC 


ATAAACCAGT 


TGGTCTGCGT 


AAGTTTGCAC 


GGCTTGG PGG 


6660 


ATTTCAGGCA 


AGACCACAAA 


GTCCATATCC 


GCTATCCAAG 


CTGGTAGAAC 


TTCAPTATPP 


o / zu 


GTTTCTGTTT 


CTTTCCATTT 


ATAGGTATGG 


TGCCCTAAAC 


GGTTGGGPAG 


fiPTTnTa AAA 
U^. 1 lu 1 AAAA 


D / OU 


TCATATTTTC 


CCATCTTTGT 


CTTATCCTTC 


X n X u\JV 4 X VJw 


PGPAAATPTH 


PA ATP A A ATP 


fLQ Af\ 


TCTAGCATCC 


TCAATCCCAA 


TAGACAAACG 


CAAGAGGTPA 


TPTPJTPAAAP 
1 v- 1 Vj 1 V-AAAe 


PAT A ft fill ATP 
l_A l AAuAA 1 Vj 


oy uu 


GCGTACCTCT 


GCTGGAATAT 


CAGCATGAGT 


•FTGAGTPGTT 


GG ATA AGT A A 


TA AP.APTTTP 


CO Cf\ 

oy ou 


CACTCCACCC 


AAACTTTCCG 


CAAAAGAGAA 


GAPPTTGAGA 


PTGTTPAAAA 


TATPAPJPR IT 




GCGTGTTTCA 


TCGGCTACTT 


TAAAGGAAAT 


PATGPPTPPA 


PGAPPAP.TPT 


A P A P. A A PTT^P 


7080 


CTTAACTGCT 


GGAGAATCCT 


TCAAAAAGGP. 


AAPPAPTTPT 


I *jW3\Jt—V» 1 1 Avj 


PTPTTY"* X PPP 


/140 


CTCCATACGA 


AGAGACAAGG 




APGAACCAAP 


TP OfT* A P-PTP. 1* 
Ivjvj A AV»*- 1 vj X 


P Al ATVP ftp a 
tAAA XTjtOAuA 


7200 


CAAGACTGCC 


CCTGTTGTAT 




AAA AA P. PTTP 


J V_Vj 1 A L AU 1 1 


1 AaACTATT 


7260 


GGTCACAACC 


ACTCCAGCCA 


AGAPATPATT 


ptpp.pptp.pt 


Avj A 1 At_ i 1 VjO 


fT*fl"*P PflV"* ft ft Tl/"> 


7320 


GAGAACGATA 


TCTGCTCCAT 


CTTCAATCGG 


& pptthpt a^: 


A 1 ALnjvvi. 1A1 


AvaAAGoTATT 


7380 


GTCCACCACC 


ACTTTGGCAC 


CCTTAGCATG 


AuLuinl 111 


PPT a. ptvpTiTT 
uV_ 1 ftlj Hill 


PP 11T ft*PPH ft ft 

l_ WlT ATC AAA 


7440 


TTCCAACATC 


AAGGGATTGG 


TTGGGGTTTC 


GATATAP.Af!A 


AP ATPr 1 & P BT 


rPTTTTPT' ft ft 

ill H_ 1 AA 


7 500 


CTCGGCAATC 


AACTCTTCTT 


CTGTATTGGC 




J VjVj AAA 1 uAL 


t_ X I I UCTC 


7560 


CACTTGGTTA 


AACCAGCGAA 


AAGAACCACC 


p/p& a apatp a 


\.\3\-f\\, 1 Vj^_k_A 


AuALL J. 1 At- 1 


7620 


TCCTACTGGA 


AAGACGCTAA 


AGGCCAGTAC 


AAT AP.PTH AP 


A 1 V-V_\_ 1 \jAv_rt_ 


T A PTPPPrp ft p 
1 Au IvIjL 1 ALi 


/ boU 


GGCATAGTCT 


GCTGACTCAA 


TAGCCGCCAA 


GAPTTPPTPA 


GPPTTAPTAP 


p A niw P ft *T T 


/ fQu 


TTTAGTGCGC 


GTATAGTCAA ACCCAGTAGA 


TPGAPPAAAP 
i v*V7AVr\> AAAV. 


TPTPP. a tppt 


PATAPPTPPT 


/oUU 


TGAAAAATGA 


AGTGGTGTCA CCAAAGCACC 


TGTTGCCTCA 


TCAGACTTGA 


TPPPTGPTTP 




TGCTAAAATT GTGTTAATGT GTAATTCCTT GCTCATACAA TTCCTCCAAA TCTATAGTAA 7920

CTATTGTACC ACTTATTTTG TATCCTTCGT TTTCTTGTTT TCAAGAGCTA GTTATAGTTT 7980

CAAACTATAT AAAAAGGGAG TTTTTCCTGC TCCCTTTAAT AGACTATAAA ATGGTGAATC 8040

TCAAAAGACA CCTTCACTCT ATCATTTGCT CCTGCACAAA ACGAGCATAA CGCTCATGAT 8100

TTTCCAGTAG TTCCTTATGA GTTCCTGAGC CAGTGATTTT CCCCTCCTCT AAGAAGAAAA 8160

TACAATCCAC ATCTTTTACC GTTGACAAAC GATGCGCTAT AATCACAACC GTCTTCTCCT 8220

TTAGTACAGA ATAGAGGCTA CTGATAATCG CATACTCAGA ATCCGCATCA AGATTAGCAG 8280

TGGCTTCATC AAATATAAGA ATTTCAGCAT CTTTTAAGTA GGCTCTAGCT ATTTGAAGTC 8340

TTTCGTTCGC CCCCCTGACA AGAGTCGTCC GCGTTCACCA ACTTCAGTAT CTAGTCCCTC 8400

TTTCATGGAG CGAATCTCAT CACCTAGTGA TACTAAGTCT AGCACTTTCA TCAATTCATC 8460 

ATCAGTTACT AAGCGATTCA AACCGAGACA AAGATTGTCA CGAATACTGC CAGATAAGAC 8520 

TGCATTATTT TGTGAAACCC AAGCGATTTT ACTTCTCCAT TCTTTTAAGT TAAAATCATA 8580 

TATACTTGAT TGCTCCATTA GAATATCTCC TGAAAGCGGT TTATAAAACC GCTCTAACAA 8640 

ACGCACAATC GTTGATTTTC CTGATCCAGA TGGTCCAACA AAAGCAATTT TTTGCCCCTT 8700 

GAAAATTGAA CAAGTAATAT CCTTTAAGAC AGGTCGATTT TCATCATAAC CAAAATAGAC 8760 

ATGGTTAAAA TTCAACCCTC GTCCTGATAC CGATTTTCCT CCCTCAAATT TTTCTTTAGG 8820 

AACTGCAAGC AAGTTCTCCA GTGCAACTGA AGATCCCTTG CTCCTAGAAT AAACAGTTAC 8880 

AAAATTAGCT ATATTACTAA TAGGATTAAG TAATTGAAAG AGGTAAATCA AAAACGAAAC 8940 

CAAGGTTCCC ACAGATATAT ATCCTGCGCT GACCCGATAA CCCCCATAGG TTAGCATCAC 9000 

AGCTATAGTC GCAAAGATAA ATAAGAGAGC AAACGGGGTC TCAAAAGAAG TAACCCTATC 9060 

TGATTTCAGT GAATTGTTTT GTACCCTTTC AATACAATTA TCCAAAACAT CCTGTACACT 9120 

TTTCTCTGCT TGGTTAGTCT TAATTAATTC ATGTTCTTGA ATCTTTTCAG TCAATTGCCC 9180 

TGTTAAATTT CCTCCTGTAA ACGACGACTA TACTTTTCAC TGATATTGGA AAGGGGCAAG 9240 

ATAATAAACA TCATACAAGG AAGAGTGATG AATAAAAGTA GAGAAAGATT CCAATCAAGA 9300 

CTAAATAAGA CTACAATGGA ACCAAGTACC ATAACTAAAC TCAGAATAAT ATTTGGGAAA 9360 

GTCGTAATTA AAAACTCACG AATGACACTC GTGTCATTGA CAATGGCAGA AGTCAACTCC 9420 

CCACTTTGGC TCTTATCAAA GAAGGATTTC TCTACATAAA TCAACCCCTC TATCACTTTT 9480 

TTCCTGATTT TTGCTATCTT TTTTTCACCC GATTGACTAA ACAGATAGTA ACCAATAGAA 9540 

GAAAACAAGG CTTGACCAAT AAAAATCAAA AACGATTGAA ATACTTTGGA GCCTATATTT 9600 

TCAATAGAAC TCCCATCTAT TAAATCCTTT AAGATAAGGG GAAGCAACAA AGCAAGTAGA 9660

CTAGACAGAA CAAGTAAGAA ACTCCCCATA ATCACCTTAG TATCTACTCT TAATAATTTT 9720 

AATTTCATAA ATACTCCTTA TAATATTTCA ACGGATAAAG TCGGGAATAA CTCAATTTGA 9780 

GGATAAAATC TAATAAATCT TCCTATAACA AAACGCATAA CATCTAGGAT TTTATATACC 9840 

TGATATTATG CGTTTTTAAG CACAAAGACT TCTTACACAA ACTTATCTAC AATTAGATTT 9900 

TATTTGACAT GTTTTGCCAA TTCTTCTTGG GCTTTTTTAT TGGATTCTTC TTTTTCTTTC 9960 

AACCATTTTT CTCTGGCTTT TGCATATTCG TCTGTTGTGA CAATCTTATC TTGTACTTTG 10020 

AGGTATTTAT ATGATTCAAC CCCTTTTGTA CCGGTTAAAC CATAGGCAGC AGCAAATGGT 10080 
ACGGTTCTTC TCAATGATGG TGTTCCCCCA CGCGAAACAC TTGGAAGAAC TAAAGAACTA 10140

TCAATCAACC AAGCTTGAAT ATCAGCATAT TTCTCATAAC GTTTGGCCGG ATCTTGCTCT 10200

TTATTAGCTT CTTCCAACAT TTGAGTATAG ACATCCAGTC CAACTGCCTT AGCCTTGTCA 10260

TTGGCCTCAC CAGGCTCTAG TCCAAGATTT TGCAGAAATC CTCCACTATT AGTATTAAAA 10320




ATATCGAGAT 


AGGTTGACGG 


GTCTTGATAA 


TCAGCYTCCCC 


AAPPGPPATG 


ATATAAATPA 
ri xm Ann x v_ A 


i m q n 
1U J 0 u 


TAATCTTTCT 


GAGCAGCTGT 


TTGAGPAAAG 


TAGPPTGAAP 


1U1 V.nnn(. 1 \— 


nil. I\j 1 i 




AATTGCTGAA 


TGTCAATCAC 


TACATTATCA 


GAAPPTAAAA 


PAGATTPAAT 


i. vin 1 iul 1 Hi 




ATAGAACTAA 


CTCCTTGTAT 


GCCTACTTTA 


, PPT*P'T v r A PTT 


PP AP A fyPfTT* 




1U3DU 


ATTGGGAATT 


GAACACCCTT 


TGCTTCGAGT 


X L X X X L X 1 rt\J 


rTTrrnp a a a 


Lil AV»L LI lb 


lUbJ U 


GCTTTCTCAG 


GATTGTAGTA 




PPATPPfiPAA 


MO 1 i uA x HLv. 


i luCLAl I v_L. 


10680 


TTACCATAGT 


TGACCATPTT 


nVJtnvjxjv. J nLn 


nv> 1 x \_ A\_LAA 


AGTCTTTTC C 


til un 1 At, lu 


10740 


ACAAAGTTTG 


GAGGAACCAC 


TARfJTTAPfiP 


& & & Anrv^TTT^ 


1 luLALUl 1 (— 


TTTCCCTTCA 


10800 


GACTGAGCCC 


CATAAGATGT 


TPTYSTPAAAA 


fiPA AAATTrtA 
uWUUAn 1 1 yjr\ 


1 nut t I vinl. v 


GAAGTTTTTA 


10860 


TTGAGAACTG 


CTTCCTGAGT 

X *. V- V- X Onv X 




1 Lnn x vj 1 v. At 


TTGTTTTAGA 


ALj 1 A 1 An 1 J. Vj 


10920 


TAAGACTTCC 


TATCTAtSfiTT 

X #» 1 vl l\vO X X 


/Uinn 1 1 Anno 


Ann 1 A lijAAo 


TTGAATTTTG 


t- A I AO, r AT AG 


10980 


ATGATATTGT 


TTTTflT ATT T 

* * X a \J X f\ X X X 


1 1 v> 1 x X/inl\. 


v-V— 1 1 L»ni nuL 


1 uvrnljt- lu i 1 


b a r r b j\ r* i\ 


11040 


CGAGCCGTAG 


TATAAGCACC 


AGC/PGTAAAA 


TTAPfJTTPPA 


ri'nr: b ttv^t'i t* 

u Lun 1 X \_ 1 l<j 


u 1 LuL 1 ALLn 


11100 


TCATAGTAGG 


TCAATTTCAC 


ATC , GTP m APA 


AAGAPATTPT 


1 AULA 1 V. tt A 


prpi JITTIPPP 
ulnnl I ALjLtVj 


11 1 fau 


TTTTTCTTAT 


ATTCAATAGC 


AGATTTTGAG 


APAAGTGPTT 


TPATPAAf^AA 
X »— A 1 Lrt/VJ AA 


MVjo 1 LLA 1 lu 


1 1 oon 
llZ/u 


TACAAAATAC 


TAGATGGATC 


CGCCTTCCCA 


aaatpa tppp 

nnn x. wn x \_ \_ v«. 


PT w P'I' r Pf2AT M P r P 
k>i 1 1 xuni L J 


V — rtou/irtn lL 1 




GCATTAACAG 


GAAAAAGTAT 


CGTTGCAAGT 


GTTTTTGAAT 


TPPAGTAAAP. 


i x L. lyro 1 1 In 




ACCAAAGTAT ATTGAACCGT TTGGTCATCA AGTGCCTTGA CACCGACAGT TGAAAAGTPG 11400

CTTGTTTTAC CAGTGATATA GTCATCCAAA CCAGCAACAG AGTCCTGCAC TAGATACAAG 11460

GCTTCTGATT TTTTATCAGC TGCATATTGC AAACCTGTCA CAAAATCCTG GGCAGTTACA 11520

GGCGCATATT CTTCTCCCTC AGAAGTAAAC CACTTGGCAT CCTTACGAAG TTTGTAGGTA 11580

TAGGTCAAAC CGTCCTGAGA AACAGTCCAA TCCTCTGCTA ATGATGGAAT AATATTCCCA 11640

TATTGGTCAT TTTCTAATAA CCCGTCTACC AAATTTGCAA CAATATCGGA TGTTGCTGCG 11700

CGGTTTTCTG CTAGATAGTT CAAGCTAGAT GGATCACTTG AATAAACATA GTTGTAGGTT 11760

TTTGACGCCG TGCTAGAATT TCCACACGCG CTCAATAAAA CTCCTGTACC CAGGACAAGA 11820

CCTGCCAAGG TTAGATATTT GCTCTTAGAC TTTTTCATTT CCGG 11864
(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2412 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 62: 



TAACTGCACT AAACATAATA TAAGGAGAGA AAATGTCTGC AATAGAACGT ATTACAAAAG 60

CTGCTCACTT AATTGATATG AACGATATTA TCCGTGAAGG GAATCCTACT CTACGCGCGA 120

TTGCTGAGGA AGTCACTTTC CCCCTATCTG ACCAGGAAAT CATCCTAGGC GAAAAGATGA 180

TGCAATTCCT TAAACATTCC CAAGATCCTG TCATGGCTGA AAAAATGGGA CTCCGCGGTG 240

GTGTTGGACT GGCTGCTCCC CAGTTAGATA TCTCAAAACG CATTATCGCT GTTTTGGTAC 300

CTAATATTGT TGAAGAAGGC GAAACTCCAC AGGAAGCCTA CGATTTGGAA GCCATTATGT 360

ACAATCCAAA AATCGTCTCT CACTCTGTTC AAGATGCTGC TCTTGGCGAA GGAGAAGGTT 420

GCCTGTCTGT TGACCGTAAC GTGCCTGGCT ATGTTGTTCG CCATGCCCGC GTTACTGTTG 480

ACTACTTTGA CAAAGATGGA GAAAAACACC GTATCAAACT CAAAGGCTAC AACTCCATTG 540

TTGTTCAGCA TGAAATTGAC CACATTAACG GTATCATGTT TTACGATCGC ATCAATGAAA 600

AAGACCCATT TGCAGTTAAA GATGGTTTAC TGATTCTTGA ATAAAGAAAA TCCCGTTGCA 660

AGACGGGGTT TTGTGTTATA ATAGAGGCAT GAAAACAAAT GATATTGTCT ATGGTGTCCA 720

CGCCGTTACC GAAGCCCTCC TTGCAAATAC AGGAAACAAA CTCTACCTCC AAGAAGATCT 780

CCGAGGTAAG AATGTTGAGA AAGTCAAGGA ACTAGCTACA GAAAAGAAGG TGTCCATTTC 840

TTGGACATCA AAAAAATCTC TCTCTGAGAT TACTGAAGGT GCTGTTCATC AAGGTTTTGT 900

TCTACGAGTG TCTGAATTTG CCTATAGCGA GCTAGATTAC ATCCTTGCAA AAACACGCCA 960

AGAAGAAAAT CCACTTCTAT TGATTCTAGA TGGTCTAACC GATCCCCATA ATCTGGGTTC 1020

TATCTTGCGA ACAGCCGATG CGACCAATGT TTCAGGTGTC ATCATTCCCA AGCACCGTAC 1080

TGTCGGAGTA ACTCCTGTCG TTGCCAAAAC AGCCACAGGT GCTATTGAAC ACGTTCCAAT 1140

TGCCCGAGTG ACCAACCTCA GTCAAACCTT AGGATAAACT TAAGGATGAA GGTTTCTGGA 1200

CCTTTGGAAC GGATATGAAC GGTACTCCTT GCCACAAGTG GAATACAAAA GGGAAAATCG 1260

CCCTCATCAT TGGAAATGAA GGAAAAGGTA TCTCTAGCAA CATCAAAAAA CAGGTCGATG 1320

AAATGATTAC CATTCCGATG AATGGACATG TTCAAAGCCT TAATGCCAGT GTTGCTGCGG 1380

CCATTCTCAT GTACGAAGTT TTCCGAAATA GACTATAAAA AAGTTTCCAG TCATCTGATT 1440

GGAAACTTTT TTATGATTAA CTATGTTCTG TAATGAATTT ATAGGCTTCT TGACCAGCGA 1500

TAGCTCCATC TCCAACCGCT GTTGTTACTT GGCGAAGGTC TTTCAAGCGA ACATCTCCAA 1560

CTGCAAAGAT ACCGTCGACT GCAGTTTTCA TGTGGTTATC TGTCACAATC CATCCTGCCT 1620

GATCTTGGAT ATTCAATTCT TTAACAAAAT CGCTAAGAGG GTCCAAACCA ACATAGATAA 1680

AGACACCACC GAAGGCTTGT TCTGTCACTT GACCTGTTTT CACATTTTCA AATACGACTG 1740

ATTCTACTCG GTTTTCACCC TTGATTTCCC TTACTACAGA ATCCCAGATA AAGCTGATTT 1800

TTTCATTCGC AAAGGCGCGA TCTTGTAAAA CCTTTTGGGC ACGAAGTTGG TCACGACGGT 1860

GAACAATGGT AACAGTCTTA GCAAAACGAG TCAAGAAGAG GGCTTCTTCA ACAGCTGAAT 1920

CTCCACCACC AACTACCAAT AAATCTTGGT CACGGAAGAA AGCACCATCA CACACAGCAC 1980

AGTAAGAAAC ACCACGACTG TTCAGTTCTT CTTCTCCAGG CACTCCCAAA GGACGGTGTT 2040

TAGAACCAGT TGCTACGATA ACTGTACGTG TTTCATATGT TTGGTCATCA GTCATCACTT 2100

TCTTAAAATC ACCATGGCTT CGACATTTTC AACATAACCA TAAATGTGCT CAACACCAAG 2160

ATTTTCAAGT GGTTCAAACA TCTTTTCAGC CAATTCAGGT CCACTAATAT TAGCGTATCC 2220

TGGGTAATTT TCGATATCAG ATGTATTATT CATCTGACCA CCTGGCAGAC CACCTTCAAT 2280

CAAAGCTACT TTTAGATTGC TTCGAGCAGC ATACAAGGCC GCAGTCATCC CTGCAGGTCC 2340

AGCACCGATA ATAATAGTAT CGTACATATA GATTCCTTCT TTCTTGGTGT AACTATCTTT 2400

ATTCTAACTC TG 2412

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7760 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

CCGATTTGGT GGAATTTTTG TCTCATCATT TAGAAGGTGT TGCAAGAGCA GAGTTTACCT 60

TGGTGCTTCA TACCAAATTG GGAGAAGCCT CTGTTTTGGC AAATATTGTA GATGTAAACA 120

AGGATGAATG GATTTTAGGA ACAGTTGCTG GTGCCAATAC CTTATTGGTT ATTTGTCGAG 180

ATCAGCACGT TGCCAAACTC ATGGAAGATC GTTTGCTAGA TTTGATGAAA GATAAGTAAG 240

GTCTTGGGAG TTGCTCTCAA GACTTATTTT TGAAAAGGAG AGACAGAAAA TGGCGATAGA 300

AAAGTTATCA CCCGGCATGC AACAGTATGT GGATATTAAA AAGCAATATC CAGATGCTTT 360

TTTGCTCTTT CGGATGGGTG ATTTTTATGA ATTATTTTAT GAGGATGCGG TCAATGCTGC 420

GCAGATTCTG GAAATTTCCT TAACGAGTCG CAACAAGAAT GCCGACAATC CGATCCCTAT 480

GGCGGGTGTT CCCTATCATT CTGCCCAACA GTATATCGAT GTCTTGATTG AGCAGGGTTA 540

TAAGGTGGCT ATCGCAGAGC AGATGGAAGA TCCTAAACAA GCAGTTGGGG TTGTTAAACG 600

AGAGGTTGTT CAGGTCATTA CGCCAGGGAC AGTGGTCGAT AGCAGTAAGC CGGACAGTCA 660

GAATAATTTT TTGGTTTCCA TAGACCGCGA AGGCAATCAA TTTGGCCTAG CTTATATGGA 720

TTTGGTGACG GGTGACTTTT ATGTGACAGG TCTTTTGGAT TTCACGCTGG TTTGTGGGGA 780

AATCCGTAAC CTCAAGGCTC GAGAAGTGGT GTTGGGTTAT GACTTGTCTG AGGAAGAAGA 840

ACAAATCCTC AGCCGCCAGA TGAATCTGGT ACTCTCTTAT GAAAAAGAAA GCTTTGAAGA 900

CCTTCATTTA TTGGATTTGC GATTGGCAAC GGTGGAGCAA ACGGCATCTA GTAAGCTGCT 960

CCAGTATGTT CATCGGACTC AGATGAGGGA ATTGAACCAC CTCAAACCTG TTATCCGCTA 1020

CGAAATTAAG GATTTCTTGC AGATGGATTA TGCGACCAAG GCTAGTCTGG ATTTGGTTGA 1080

GAATGCTCGC TCAGGTAAGA AACAAGGCAG TCTTTTCTGG CTTTTGGATG AAACCAAAAC 1140

GGCTATGGGG ATGCGTCTCT TGCGTTCTTG GATTCATCGC CCCTTGATTG ATAAGGAACG 1200

AATCGTCCAA CGTCAAGAAG TAGTGCAGGT CTTTCTCGAC CATTTCTTTG AGCGTAGTGA 1260

CTTGACAGAC AGTCTCAAGG GTGTTTATGA CATTGAGOGC TTGGCTAGTC GTGTTTCTTT 1320

TGGCAAAACC AATCCAAAGG ATCTCTTGCA GTTGGCGACT ACCTTGTCTA GTGTGCCACG 1380

GATTCGTGCG ATTTTAGAAG GGATGGAGCA ACCTACTCTA GCCTATCTCA TCGCACAACT 1440

GGATGCAATC CCTGAGTTGG AGAGTTTGAT TAGCGCAGCG ATTGCTCCTG AAGCTCCTCA 1500

TGTGATTACA GATGGGGGAA TTATCCGGAC TGGATTTGAT GAGACTTTAG ACAAGTATCG 1560

TTGCGTTCTC AGAGAAGGGA CTAGCTGGAT TGCTGAGATT GAGGCTAAGG AGCGAGAAAA 1620

CTCTGGTATC AGCACGCTCA AGATTGACTA CAATAAAAAG GATGGCTACT ATTTTCATGT 1680

GACCAATTCG CAACTAGGAA ATGTGCCAGC TCACTTTTTC CGCAAGGCGA CGCTGAAAAA 1740

CTCAGAACGC TTTGGAACCG AAGAATTAGC CCGTATCGAG GGAGATATGC TTGAGGCGCG 1800

TGAGAAGTCA GCCAACCTCG AATACGAAAT ATTTATGCGC ATTCGTGAAG AGGTCGGCAA 1860

GTACATCCAG CGTTTACAAG CTCTAGCCCA AGGAATTGCG ACGGTTGATG TCTTACAGAG 1920

TCTGGCGGTT GTGGCTGAAA CCCAGCATTT GATTCGACCT GAGTTTGGTG ACGATTCACA 1980

AATTGATATC CGGAAAGGGC GCCATGCTGT CGTTGAAAAG GTTATGGGGG CTCAGACCTA 2040

TATTCCAAAT ACGATTCAGA TGGCAGAAGA TACCAGTATT CAACTGGTTA CAGGGCCAAA 2100

CATGAGTGGG AAGTCTACCT ATATGCGTCA GTTAGCCATG ACGGCGGTTA TGGCCCAGCT 2160

GGGTTCCTAT GTTCCTGCTG AAAGCGCCCA TTTACCGATT TTTGATGCGA TTTTTACCCG 2220 

TATCGGAGCA GCAGATGACT TGGTTTCGGG TCAGTCAACC TTTATGGTGG AGATGATGGA 2280 

GGCCAATAAT GCCATTTCGC ATGCGACCAA GAACTCTCTC ATTCTCTTTG ATGAATTGGG 2340 

ACGTGGAACT GCAACTTATG ACGGGATGGC TCTTGCTCAG TCCATCATCG AATATATCCA 2400 

TGAGCACATC GGAGCTAAGA CCCTCTTTGC GACCCACTAC CATGAGTTGA CTAGTCTGGA 2460 

GTCTAGTTTA CAACACTTGG TCAATGTCCA CGTGGCAACT TTGGAGCAGG ATGGGCAGGT 2520 

CACCTTCCTT CACAAGATTG AACCGGGACC AGCTGATAAA TCTACGGTAT CCATGTTGCC 2580

AAGATTGCTG GCTTGCCAGC AGACCTTTTA GCAAGGGCGG ATAAGATTTT GACTCAGCTA 2640 

GAGAATCAAG GAACAGAGAG TCCTCCTCCC ATGAGACAAA CTAGTGCTGT CACTGAACAG 2700 

ATTTCACTCT TTGATAGGGC AGAAGAGCAT CCTATCCTAG CAGAATTAGC TAAACTGGAT 2760 

GTGTATAATA TGACACCTAT GCAGGTTATG AATGTCTTAG TAGAGTTAAA ACAGAAACTA 2820 

TAAAACCAAG ACTCACTAGT TAATCTAGCT GTATCAAGGA GACTTCTTTG ACAATTCTCC 2880 

ACTTTTTTGC TAGAATAACA TCACACAAAC AGAATGAAAA GGAGCTGACG CATTGTCGCT 2940 

CCCTTTTGTC TATTTTTTAA GGAGAAAGTA TGCTGATTCA GAAAATAAAA ACCTACAAGT 3000 

GGCAGGCCCT GGCTTCGCTC CTGATGACAG GCTTGATGGT TGCTAGTTCA CTTCTGCAAC 3060 

CGCGTTATCT GCAGGAAGTC TTAGGCGCCC TCCTTACTGG GAAATATGAA GCTATTTATA 3120 

GTATCGGGGC TTGGTTGATT GGTGTGGCCG TAGTCGGTCT AGTTGCTGGT GGACTCAATG 3180 

TTGTCCTCGC AGCCTATATT GCCCAAGGAG TTTCATCCGA CCTTCGGGAG GATGCCTTCC 3240 

GTAAAATTCA AACCTTTTCT TATGCTGATA TTGAACAATT TAATGCGGGA AATCTAGTCG 3300 

TTCGAATGAC AAATGATATC AACCAGATTC AGAACGTTGT CATGATGACC TTCCAAATTC 3360 

TTTTCAGACT TCCCCTCTTG TTCATCGGTT CGTTTATCCT AGCGGTTCAA ACCTTACCTT 3420 

CTCTGTGGTG GGTGATTGTT CTCATGGTAG TCTTGATTTT TGGTTTGACT GCTGTCATGA 3480

TGGGAATGAT GGGGCCTCGT TTTGCCAAGT TTCAAACCCT TCTTGAGCGC ATCAATGCCA 3540 

TTGCCAAGGA AAATTTACGT GGCGTTCGTG TGGTCAAGTC CTTTGTCCAA GAAAAAGAGC 3600

AATTTGCTAA GTTTACAGAG GTCTCAGACG AGCTTCTTGG TCAAAACCTT TACATTGGTT 3660 

ATGCCTTTTC AGTAGTGGAA CCCTTTATGA TGTTGGTTGG TTACGGGGCG GTCTTCCTCT 3720 

CTATTTGGCT GGTCGCGGGA ATGGTTCAGT CGGATCCGTC TGTTGTTGGT TCCATCGCTT 3780

CTTTTGTTAA TTACCTAAGC CAGATTATCT TTACCATTGT TATGGTTGGA TTTTTGGGAA 3840

ATTCTGTCAG CCGTGCCATG ATTTCCATGC GTCGTATTCG AGAAATTCTT GACGCAGAGC 3900

CAGCTATGAC CTTCAAGGAT ATCCCAGATG AAGAGTTGGT TGGAAGTCTT AGCTTTGAAA 3960

ATGTGACCTT TACCTATCCA ATGGACAAGG AACCGATGCT GAAAGATGTG AGCTTTACTA 4020

TTGAACCTGG TCAAATGGTT GGTGTAGTTG GAGCGACTGG TGCAGGAAAG TCAACCTTGG 4080

CTCAATTGAT TCCACGTCTC TTTGATCCAC AGGACGGGGC CATTAAAATC GGTGGCAAGG 4140

ATATTCGAGA AGTGAGTGAA GGAACCCTGC GTAAAACAGT TTCCATCGTT CTCCAACGTG 4200

CCATTCTTTT TAGTGGAACG ATTGCAGATA ACTTGAGACA GGGGAAGGGG AATGCTACTC 4260

TATTTGAAAT GGAGCGCGCA GCCAATATTG CCCAGGCTAG TGAATTCATT CATCGTATGG 4320

AGAAAACCTT TGAAAGTCCA GTTGAAGAAC GGGGAACCAA TTTCTCTGGT GGACAAAAAC 4380

AAAGGATGTC GATTGCGCGT GGGATTGTCA GCAATCCACG TATTCTGATT TTTGATGATT 4440

CGACCTCAGC CTTGGATGCC AAATCAGAGC GCTTGGTGCA AGAAGCTTTG AATAAGGACT 4500

TGAAGGGGAC GACAACCATT ATTATTGCTC AAAAAATTAG CTCGGTTGTC CATGCAGACA 4560

AGATCTTGGT TCTAAATCAA GGACGATTGA TTGGTCAAGG TACGCATGCA GACTTGGTTG 4620

CCAACAATGC CGTTTACCGT GAAATCTATG AAACACAGAA ATGAAAGACA AACTATAAGA 4680

AAAGTCAATA GTTTTATCTA AACTATTTCT TATTTCAATT TGATGATTTG GCGATGATTT 4740

TAGAGCACGG CAAAAAGCCC TTGAAAAAGT CCATTTTTTC AAAGGTAATC CTGTGTTAAT 4800

TTCAGAAATT ACATCACTTT TTGTTCGTCA AATGGCAGCT CTTTTTTTAG GATATAAAAC 4860

AGGGTTCGGA TAAGTTTTTT TGCAAGGTGG ATGATGGCTA CATTGTAATG TTTTCCTTGT 4920

TCTAATTTAG TCTTAAGATA GGCCTTAAAA GCAGGCGAAA AGCGAGGGCA TGCTTTGGCA 4980

GCTTGTATGA GTACCTACCG CAGATGAGGG GAACTCCGTT TGACCATTCT TCCTGCTAAA 5040

TCAATCTGAT CTGACTGATA AATAGAAGAA TCCAGTCCAG CGAAAGCTTG TAATTGAGCA 5100

GGATTATCAA AGGCATGAAT ATTTCGAATC TCAGCTAAAA TGACCGCCCC TAAACGATCC 5160

CCAATCCCAG TAACCGTCGT GATGACCGAG TTGAACTCAG CCATCAAGTC ATTGACACAT 5220

GTTTCCGCCT TGTCAATGAG CCTCTTGTAA TGTTTGATGT TTTCATTACA CGAGATAAAA 5280

CGTCTATGCG TTATCAAACT CATTACCAAT TAAAACAAAA AGCTGTGGTT AGATCCTTTC 5340

GGAAATTGTC AAGCGATTGG AGGAAATGAA CTAATCCACA GCGGCTTATT CCAAGTATAC 5400

CACTTGGGCT TTGGCAGTAG CTAACTGCGC TAAATATAAT ATAAGGAGGA GTAAAATGAA 5460

GACAGTTCAA TTTTTTTGGC ATTATTTTAA GGTCTACAAG TTCTCATTTG TAGTTGTCAT 5520

CCTGATGATT GTTCTGGCGA CTTTTGCCCA AGCCCTCTTT CCAGTCTTTT CTGGACAAGC 5580

GGTGACGCAG CTAGCCAATT TAGTTCAAGC TTATCAAAAT GGCAATCCAG AACTTGTATG 5640

GCAAAGCCTA TCAGGAATCA TGGTCAATCT TGGCCTGCTG GTTTTGGTTC TATTTATCTC 5700

TAGTGTAATA TACATGTGTC TCATGACGCG CGTGATTGCA GAATCGACCA ACGAGATGCG 5760 

CAAAGGCCTC TTTGGTAAGC TTGCTCAGTT GACGGTTTCT TTCTTTGACC GTCGACAAGA 5820 

TGGCGATATC CTGTCTCATT TTACCAGTGA TTTGGATAAT ATCCTCCAAG CCTTTAACGA 5880 

AAGCTTGATT CAGGTCATGA GCAATATTGT TTTATACATT GGTCTGATTC TTGTCATGTT 5940 

TTCGAGAAAT GTGACGCTGG CTCTCATCAC CATTGCCAGC ACCCCATTGG CTTTCCTTAT 6000 

GCTGATTTTC ATCGTGAAAA TGGCACGCAA ATACACCAAC CTCCAGCAGA AAGAGGTAGG 6060 

GAAGCTCAAC GCCTATATGG ATGAGAGCAT CTCAGGCCAA AAAGCCGTGA TTGTGCAAGG 6120 

AATTCAAGAG GATATGATGG CAGGATTTCT TGAACAAAAT GAGCGCGTGC GCAAGGCAAC 6180 

CTTTAAAGGA AGAATGTTCT CAGGAATTCT TTTCCCTGTC ATGAATGGGA TGAGCCTGAT 6240 

TAATACAGCC ATCGTCATCT TTGCTGGTTC GGCTGTACTT TTGAATGATA AGTCTATTGA 6300 

AACAAGTACA GCCCTAGGTT TGATTGTTAT GTTTGCACAA TTTTCACAGC AGTACTACCA 6360 

GCCTATTATC CAAGTTGCAG CGAGTTGGGG AAGCCTTCAG TTGGCCTTTA CTGGAGCTGA 6420 

ACGAATTCAG GAAATGTTTG ATGCAGAGGA GGAAATCCGA CCTGAAAAGG CTCCAACCTT 6480 

CACTAAGTTG CAAGAAAGTG TTGAAATCAG TCATATCGTT TTTTCATACT TGCCTGATAA 6540 

ACCTATTTTG AAAGATGTCA GCATTTCTGC CCCTAAAGGC CAGATGACAG CAGTTGTTGG 6600 

GCCGACAGGT TCAGGAAAAA CGACTATTAT GAACCTCATC AATCGCTTTT ATGATGTTGA 6660

TGCTGGTGGT ATTTATTTTG ATGGTAAAGA CATTCGTGGC TATGACTTAG ATAGTCTTAG 6720 

AAGCAAGGTG GGAATTGTAT TGCAAGATTC GGTCTTGTTT AGCGGAACGA TTAGAGACAA 6780 

TATCCGATTT GGTGTGCCAG ATGCTAGTCA GGAAATGGTT GAGGTAGCAG CAAAAGCAAC 6840 

CCACATTCAC GACTATATCG AAAGTTTGCC TGATAAGTAC GATACTCTTA TTGATGATGA 6900 

CCAGAGCATC TTTTCAACAG GGCAGAAGCA ATTGATTTCA ATCGCTCGAA CCCTGATGAC 6960 

AGATCCAGAA GTTCTCATTC TCGATGAAGC AACTTCAAAC GTAGATACGG TGACAGAAAG 7020 

CAAGATTCAG CATGCCATGG AGGTGGTTGT AGCAGGTAGA ACTAGTTTCG TCATTGCCCA 7080 

CCGCTTGAAA ACCATTCTCA ATGCAGATCA GATTATTGTC CTTAAAGATG GAGAAGTCAT 7140 

TGAACGTGGT AACCACCATG AACTTTTGAA GCTAGGTGGC TTTTATTCAG AACTCTATCA 7200 

CAATCAATTT GTTTTCGAAT AAGAAAGAAG TTGTCCTATG TGGGCAGCTT TTTCTTGTCC 7260 

ATAAAAAATG TTTATCACAG CCTTAAAAAA AACATATTAG ACGAAAGTCA TTTTGAGTGA 7320 

TATGATAGGA CTATCGTTAG CATTCGAAAG GAGAGGCATC ATGGCTAGAA CGGTTGTAGG 7380 

AGTTGCTGCA AATCTATGTC CCGTAGACGC AGAAGGCAAA ATCATTCATT CATCTGTATC 7440

TTGTAGATTC GCAGAGATCA TTCGTCAAGT CGGTGGTCTC CCTTTAGTCA TTCCTGTTGG 7500

TGATGAGTCA GTTGTACGTG ATTATGTGGA AATGATTGAC AAACTCATTT TGACAGGAGG 7560 

CCAAAATGTT CATCCTCAGT TTTATGGAGA GAAAAAGACC GTCGAGAGCG ATGATTACAA 7620 

TCTGGTCCGT GACGAATTTG AATTGGCACT CTTGAAGGAA GCGCTTCGTC AGAATAAACC 7680 

AATTATGGCA ATCTGTCGCG GTGTCCAACT TGTCAATCTT GCCTTTGGTG GAACCCTCAA 7740 

TCAAGAAATC GAAGGTCAGG 7760
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2723 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

GAGGTTTTAA TTCACTTACC TCTsCCGTAT CTTTATTTAA AATGAATTCT TTTACGGTTG 60 

TATTTCTTGC AAAATCTTTT ACAACAATCT TAATGTTTAG TGTCTTGTCT ATTATTTGTT 120 

TAATATCATT AAATGATGTA TATTCTTTTC CATTTATATA AATATGTTGT TCTTGAATCT 180 

CACCATCGAA TCCATTATTT CTTTTATCAT TGATGTTAAA GACTACAGAT TTTCCATCAG 240 

CATATTCGAT ACTAGTATTT CCCTTAGGAT CAATGTTTAC TTCGGGTTTA ACATTATCAT 300

ATAAAAACTG ATAGTGGACT CCAACTGCTT TAGCATTCAA ATCGCTATAG CCAGTTTGAA 360 

GATAAACATT TCCATCCATA TCTGTTACCT TATCTGGAAA TCCGTTTGCT TTATAGTCTT 420 

TCATTCCCCA GTCCATGATG TCACCGTCTT TAACATTCAG CTTAATATTA AAATCTCTAG 480 

TGTTATCAAT GTGTAAATCT CCGTAGATTA AATAATTATC TACAACCGAT TCATTAACTC 540 

TCAATTCCCA GTTAAAACCA CCCTTATCAG AAATCTTACC TCTTAAATAA AATTCTGGAT 600 

TTCGTACATA AATTTTATTA GATTTAGATG GATTAAAGTA GTTCTTATCC ATTGAAAGGT 660 

TTACTGGTTT GGTATCAATA AATAACATGG AGCCATCTTC TTTTATAGCT TCTACATTGA 720 

ACTTATCCTC TCCAGTGTAT TCTTTATCAT CCTTACCAAA TAATACAAGT TTAGAAGAAT 780 

CTGTCACAAG ATTTCCGTCT TTATCGATAG CTTCCCCTTT ATCGTTCATT TTAAATGTAA 840 

ACACTTGATA CCTTATAATG TTAAAGCCGT CCAAAGCCGA CATTAATACA GATTGGGTAC 900 

TTCTTCCATC TTCAACATTT CTACTATCAG CATAAATTGT TGTTTCTGAA AGGGCTCTTA 960 

GATTAGGATT GGCCTTTTGT ATTTTTGCTA TATCTTCCTT GCTATAGACT CCATTTCCTT 1020

CTAACATATC CGTTTTTCCA GGATTATAGG TAGTCACTTT TAGTGCATAG CCTTTTCTTA 1080

IjAA i\iATA TT ATCCTTTAAC AGATATTGTT GTTTTTCTGA ATCAGAATAG ATTTTACCAG 1140

Ai iCCATTTT AGTTAAATTG TCTGGTTTGT TTTTTGAAAG ATCTCCTTCC CCTAATTCTA 1200

J. wAvJA II CCC ATAACTTGAT ACATAGGGAT ATTCTGATTT AGTTTCCTTA ATTTTTTCAG 1260

tjLAI ICTAAT TTTAATTTCA GCTTTTTTCT GATCATTATC TTT A. 2 P & A &. T 1 I 1 AnLAAA i AAI U.IL.A1 AT 1320

CTCCTGCAAA AGCTAATCCA TCCACAATAT CATTAATATT AL»L\j 1 A i ACjA TCAAATGTCA 1380

TCGTTTTTGA GTGGAAATCA TACTTGGTCG CTTTGATTTC rATAGATTTA TAGTTATTCC 1440

CATAATATAC CTTGGCATTT TTAGAAACAT TACTTATCTT TCCAAGAATT TCAAAGTGTC 1500

CATCTTTAGA CGGACTTAGA ACACCATAAA TTTTTGATTT GATTTCGTCA AGTTTCTCAG 1560

TTTCATATTC TAGATCAGTC CCATCATCGT AGGCTATTAT ATTTCCTTTA TCATCGTATT 1620

TATAATCGTA TTCCTCCATT CTCTTACCAG TTTCACTTGT AAAATCATCA ACTTCTCTAA 1680

ATTTCTTTTT AATGAGTTTC TTTAAGTCTT TATTTTCAAA GTCTCTAATT GTTGAAATAT 1740

TTCTATCAAT AGTAAAACTA GATTTTTCTT TAATAGACTC TTCATTTTCT TGATGATGAT 1800

GTTCTACCCC AGTTGTATCT TTTTTTAGAC TACCCTCTTT TCCATTTCCT AAATTTTTAA 1860

ATTTAGATTC TGCAATCTCG CCAAGCTTTT GATATTTAGA TGAATCTTGA TCAGGATCTA 1920

CTAGATAATA GGAAATCATC CCCTTTTCAT CAGCCTGATT AwLAAATTTA ATTCTATGAA 1980

TCTTTGTGAA ATTGCTAGAA CCATCTAATG CAATGACTTC AA I (jA I 11 I J CCCCTTAAAT 2040

CTCCCGCACC TTTAATTTCA TAAATGGTAT TTCCGTCTTT A I CAAGTTTT CTATTTCTTC 2100

CTTGACCCTC ACCTGCGTAA GTTACTTCAA GATTTTTTTC AACCTCTCCA TCTTCATTAA 2160

CAAGAGCGGC GCCAGCATAC CAAACTTCGT TCGCAATCTC GTCAAATTTT TCAGGATGTT 2220

t-TTTTTGATC TCTCGCAAAT AGCGTTTCAT TCTTATACTG ATCTTTTACC TTATGATAAG 2280

TATCCTTTGT AATCAACTTA ATTTTTTCAG GATTTGAAAA ATCAACCGAA ACAATCTTAG 2340

GGGCGGTGTT ATCAATTTTT ACAGGAATAT AGGAAACCTG CCATGGGTAA TCTTTAGTTA 2400

ATCTATATTT AAATTTATAG AAATATTGAC CTTCCGCAAT CGGTTCAAAT TGACCTCTTA 2460

TCTTAGTAGC AGGATCTTGA TTATCCTTAC TTTCTGGTGC ATTTTCTTCT CTACCTCTAG 2520

GATTATAGAT GAGTCCATCC CACTTCAAGT CACCCCAAAC TTTTAGTTTA GATGATTTGA 2580

TTCCCTTTGC ATCATTGCTT TTAGAATTTA AAATTCCTCT AATAAAGTGT TCTCTCGAAA 2640

TGACTTTTAA GTCTCTTTGA TTTTCTCCCT CTTTATTTGT ATTTACTATT GAAATCAATC 2700

CTTCTTCTGC ACTTCTTAAT ACA 2723


(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11831 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AAAAAAGTGG GAATGACTCA AATCTTCACT GAAGCTGGCG AATTGATCCC TGTAACAGTT 60

ATTGAAGCAA CTCCAAACGT TGTTCTTCAA GTTAAAACTG TTGAAACAGA CGGATACAAC 120

GCTATCCAAG TTGGTTTCGA TGACAAACGC GAAGTATTGA GCAACAAACC TGCTAAAGGA 180

CATGTAGCGA AAGCTAACAC GGCTCCTAAG CGCTTCATTC GTGAATTCAA AAACGTTGAA 240

GGCTTGGAAG TTGGTGCTGA AATTACAGTT GAAACATTCG CAGCTGGAGA CGTTGTTGAC 300

GTAACGGGTA CTTCTAAAGG TAAAGGTTTC CAAGGTGTTA TCAAACGCCA CGGACAATCA 360

CGTGGACCAA TGGCTCACGG TTCTCGTTAC CACCGTCGTC CAGGTTCTAT GGGGCCTGTT 420

GCACCTAACC GCGTATTCAA AGGTAAAAAC CTTGCAGGAC GTATGGGTGG CGACCGCGTA 480

ACAATTCAAA ACCTTGAAGT TGTACAAGTT GTTCCAGAAA AGAACGTTAT CCTTATCAAA 540

GGTAACGTAC CAGGTGCTAA GAAATCTCTT ATCACTATCA AATCAGCAGT TAAAGCTGGT 600

AAATAATAAA GAAAGGGGAA ATCAGTCACA ATGGCAAACG TAACATTATT TGACCAAACT 660

GGTAAAGAAG CTGGCCAAGT TGTTCTTAGC GATGCAGTAT TTGGTATCGA ACCAAATGAA 720

TCAGTTGTGT TTGATGTAAT CATCAGCCAA CGCGCAAGCC TTCGTCAAGG AACACACGCT 780

GTTAAAAACC GCTCTGCAGT ATCAGGTGGT GGACGCAAAC CATGGCGTCA AAAAGGAACT 840

GGACGTGCTC GTCAAGGTTC TATCCGCTCA CCACAATGGC GTGGTGGTGG TGTTGTCTTC 900

GGACCAACTC CACGTTCATA CGGCTACAAA CTTCCACAAA AAGTTCGTCG CCTAGCTCTT 960

AAATCAGTTT ACTCTGAAAA AGTTGCTGAA AACAAATTCG TAGCTGTAGA CGCTCTTTCA 1020

TTTACAGCTC CAAAAACTGC TGAATTTGCA AAAGTTCTTG CAGCATTGAG CATCGATTCT 1080

AAAGTTCTTG TTATCCTTGA AGAAGGAAAT GAATTCGCAG CTCTTTCAGC TCGTAACCTT 1140

CCAAACGTGA AAGTTGCAAC TGCTACAACT GCAAGTGTTC TTGACATCGC AAATAGCGAC 1200

AAACTTCTTG TCACACAAGC AGCTATCTCT AAAATCGAGG AGGTTCTTGC ATAATGAATT 1260

TGTATGATGT TATCAAAAAA CCTGTCATCA CTGAAAGCTC AATGGCTCAA CTTGAAGCAG 1320

GAAAATATGT ATTTGAAGTT GACACTCGTG CACACAAACT TTTGATCAAG CAAGCTGTTG 1380

AAGCTGCTTT CGAAGGTGTT AAAGTTGCCA ATGTTAACAC AATCAACGTA AAACCAAAAG 1440

CTAAACGTGT TGGACGTTAC ACTGGTTTTA CTAACAAAAC TAAAAAAGCT ATCATCACAC 1500

TTACAGCTGA TTCTAAAGCA ATCGAGTTGT TTGCTGCTGA AGCTGAATAA TCTAAGGAGG 1560 

AAATATCGTG GGAATTCGTG TTTATAAACC AACAACAAAC GGTCGCCGTA ATATGACTTC 1620 

TTTGGATTTC GCTGAAATCA CAACAAGCAC TCCTGAAAAA TCATTGCTTG TTGCATTGAA 1680 

GAGCAAGGCT GGTCGTAACA ACAACGGTCG TATCACAGTT CGTCACCAAG GTGGTGGACA 1740 

CAAACGTTTC TACCGTTTGG TTGACTTCAA ACGTAATAAA GACAACGTTG AAGCAGTTGT 1800 

TAAAACAATC GAGTACGATC CAAACCGTTC TGCAAACATC GCTCTTGTAC ACTACACTGA 1860

CGGTGTGAAA GCATACATCA TCGCTCCAAA AGGTCTTGAA GTAGGTCAAC GTATCGTTTC 1920 

AGGTCCAGAA GCAGATATCA AAGTCGGAAA CGCTCTTCCA CTTGCTAACA TCCCAGTTGG 1980 

TACTTTGATT CACAACATCG AGTTGAAACC AGGTCGTGGT GGTGAATTGG TACGTGCTGC 2040 

TGGTGCATCT GCTCAAGTAT TGGGTTCTGA AGGTAAATAT GTTCTTGTTC GTCTTCAATC 2100 

AGGTGAAGTT CGTATGATTC TTGGAACTTG CCGTGCTACA GTTGGTGTTG TCGGAAACGA 2160 

ACAACATGGA CTTGTAAACC TTGGTAAAGC AGGACGTAGC CGTTGGAAAG GTATCCGCCC 2220 

AACAGTTCGT GGTTCTGTAA TGAACCCTAA CGATCACCCA CACGGTGGTG GTGAAGGTAA 2280 

AGCACCAGTT GGTCGTAAAG CACCATCTAC TCCATGGGGC AAACCTGCTC TTGGTCTTAA 2340 

AACTCGTAAC AAGAAAGCGA AATCTGACAA ACTTATCGTT CGTCGTCGCA ACGAGAAATA 2400 

ATATTAAACT AGTCGCTTAA GCAACTAGTA AATCCGCCAG CTCGGTAGCG CTCCATAGGA 2460 

GTGCAAGCCG CTGTGGTACA ACATTTAAAG GAGAAAATAT AAAAATGGGA CGCAGTCTTA 2520 

AAAAAGGACC TTTCGTCGAT GAGCATTTGA TGAAAAAAGT TGAAGCTCAA GCTAACGACG 2580 

AAAAGAAAAA AGTTATTAAA ACTTGGTCAC GTCGTTCAAC GATCTTCCCA AGTTTCATTG 2640 

GTTACACTAT TGCAGTTTAT GACGGACGTA AACACGTACC TGTTTACATC CAAGAAGACA 2700 

TGGTAGGCCA CAAACTTGGT GAATTTGCAC CAACTCGTAC TTACAAAGGT CACGCTGCAG 2760 

ACGACAAGAA AACACGTAGA AAATAAGGAG AACATAAATG GCAGAAATTA CTTCAGCTAA 2820 

AGCAATGGCT CGTACAGTAC GTGTTTCACC TCGTAAATCA CGTCTTGTTC TTGATAACAT 2880 

CCGTGGTAAA AGCGTAGCCG ATGCAATCGC AATCTTGACA TTCACTCCAA ACAAAGCTGC 2940 

TGAAATCATC TTGAAAGTTT TGAACTCAGC TGTAGCTAAC GCTGAAAACA ACTTTGGTTT 3000 

GGATAAAGCT AACTTGGTAG TATCTGAAGC ATTCGCAAAC GAAGGACCAA CTATGAAACG 3060 

TTTCCGTCCA CGTGCGAAAG GTTCAGCTTC ACCAATCAAC AAACGTACAG CTCACATCAC 3120 

TGTAGCTGTT GCAGAAAAAT AAGGAGGTAA AATCGTGGGT CAAAAAGTAC ATCCAATTGG 3180

TATGCGTGTC GGCATCATCC GTGATTGGGA TGCCAAATGG TATGCTGAAA AAGAATACGC 3240

GGATTACCTT CATGAAGATC TTGCAATCCG TAAATTCGTT CAAAAAGAAC TTGCTGACGC 3300

AGCAGTTTCA ACTATTGAAA TCGAACGCGC AGTAAACAAA GTTAACGTTT CACTTCACAC 3360

TGCTAAACCA GGTATGGTTA TCGGTAAAGG TGGTGCTAAC GTTGATGCaC TCCGTGCAAA 3420 

ACTTAACAAA TTGACTGGAA AACAAGTACA CATCAACATC ATCGAAATCA AACAACCTGA 3480 

TTTGGATGCT CACCTTGTAG GTGAAGGAAT TGCTCGTCAA TTGGAGCAAC GTGTTGCTTT 3540 

CCGTCGTGCA CAAAAACAAG CAATCCAACG TGCAATGCGT GCTGGAGCTA AAGGAATCAA 3600 

AACTCAAGTA TCAGGTCGTT TGAACGGTGC AGATATCGCC CGTGCTGAAG GATACTCTGA 3660 

AGGAACTGTT CCGCTTCACA CACTTCGTGC AGATATCGAT TACGCTTGGG AAGAAGCAGA 3720 

TACTACATAC GGTAAACTTG GTGTTAAAGT ATGGATCTAC CGTGGTGAAG TTCTTCCAGC 3780 

TCGTAAAAAC ACTAAAGGAG GTAAATAACC AATGTTAGTA CCTAAACGTG TTAAACACCG 3840 

TCGTGAGTTC CGTGGAAAAA TGCGCGGTGA AGCAAAAGGT GGAAAAGAAG TAGCATTCGG 3900 

TGAATACGGT CTTCAAGCTA CAACTAGCCA CTGGATCACT AACCGCCAAA TCGAAGCTGC 3960 

TCGTATCGCC ATGACTCGTT ACATGAAACG TGGTGGTAAA GTTTGGATTA AAATCTTCCC 4020 

ACACAAATCA TACACTGCTA AAGCTATCGG TGTGCGTATG GGATCTGGTA AAGGGGCACC 4080 

TGAAGGTTGG GTAGCACCAG TTAAACGTGG TAAAGTGATG TTCGAAATCG CTGGTGTATC 4140 

TGAAGAGATT GCACGTGAAG CGCTTCGACT TGCTAGCCAC AAATTGCCAG TTAAATGTAA 4200 

ATTCGTAAAA CGTGAAGCAG AATAAGGAGA AGGCATGAAA CTTAATGAAG TAAAAGAATT 4260 

TGTTAAAGAA CTTCGTGGTC TTTCTCAAGA AGAACTCGCG AAGCGCGAAA ACGAATTGAA 4320 

AAAAGAATTG TTTGAACTTC GTTTCCAAGC TGCTACTGGT CAATTGGAAC AAACAGCTCG 4380 

CTTGAAAGAA GTTAAAAAAC AAATCGCTCG CATCAAAACA GTTCAATCTG AAGCGAAATA 4440 

ATAGACTAGG GAAGGAGAAA TTTCAATGGA ACGCAATAAT CGTAAAGTTC TTGTTGGACG 4500 

TGTTGTATCT GACAAAATGG ACAAGACAAT CACAGTTGTA GTTGAAACAA AACGTAACCA 4560 

CCCAGTCTAT GGTAAACGTA TTAACTACTC TAAAAAATAC AAAGCTCATG ATGAAAACAA 4620 

TGTTGCCAAA GAAGGCGATA TCGTACGTAT CATGGAAACT CGCCCGCTTT CAGCTACAAA 4680 

ACGTTTCCGT CTTGTAGAAG TTGTTGAAGA AGCGGTCATC ATCTAATCAA ACCTGAAAGG 4740 

AGAAAACTGA AATGATTCAA ACAGAAACTC GTTTGAAAGT CGCAGACAAC AGCGGTGCTC 4800 

GCGAAATCTT GACTATCAAA GTTCTTGGTG GTTCAGGACG TAAATTTGCA AACATCGGTG 4860 

ATGTTATCGT GGCATCTGTA AAACAAGCTA CTCCTGGTGG TGCGGTTAAA AAAGGTGACG 4920 

TTGTTAAAGC AGTTATCGTT CGTACTAAAT CAGGTGCTCG TCGTGCTGAT GGTTCATACA 4980

TCAAATTTGA CGAAAACGCA GCAGTTATCA TCCG^AAGA CAAAACTCCT CGCGGAACAC
GTATCTTTGG CCCA GTO3CA CGTGAATTGC GTGAAGGTGG CTTCATGAAG AttX^c 
TTGCTCCAGA AGTACTTTAA TTTTTAGGAA CAAACTAGTC CCCTAGC^C AAGCTAGGGT 
GCCCTTATGG GCGTAAGAAA AATCAAGGAG AAACCTAATG TTTGTAAAAA AAGGCGACAA 
AGTTCGCGTA ATCQWW* AAGATAAGGG AACAGAAGCT GTTGTCCTTA C^CCCTTCC 
AAAAGTAAAC AAAGTTATCG TTGAAGGTGT TAACATTGTT AAGAAACACC AACGTCCAAC 
TAAOGAGCTT CC^AACG^ GTATCATCGA GAAAGAAGCA GCTATCCACG TATCAAACGT 
-CAAC™ GACAAAAA.G TcGR3TOGGA tacaaatttc TAGACGGTAA 

""WCaC TACAACAAAA AATCAGGCGA AGTGCTTGAT TAATCACGAA GGAAAGGAGA 
AGTATAATGG CAAATCGTTT aaaagaaaaa TATCWA A TC AAGTAGTTCC TGCTTTCACA 

gaacaattca actactcatc ag TCATCGCT aiaeeaMa tagataagat 

A-GC^ GTGAAGCTGT ATCAAACGCT AAAAGCCTTG AAAAAGCTGC TGAAGAA^G 
GCACTTATCT caggtcaaaa ACCACTTATC ACTAAAGCTA AAAAA T CAA T cgccggc TTC 
CGTCTTCGTG AAGGTGTTGC GATCGG^CA AAAGTTACCC TTCGTGGTGA ACGTATGTAC 
GAATTCTTGG ATAAATTGGT A^AGT^A CTTCCACG^ TACGTGACTT CCACGG^C 
CCAACAAAAT CA^ATGG ACGCGGGAAC TACACACTTG GTGTGAAAGA ACAATTAATC 
TTCCCAGAAA TCAACTTCGA TGACGTTGAC AAAACTCGTG GTCTTGACAT CGTTATCGTA 
ACAACTGCTA ACACTGACGA AGAGTCACGT GCATTGCTTA CAGGO^GG AATGCCTTTT 
GCAAAATAAT ATAGGAGGTA AATCTAATGG CTAAAXAATC AATGGTAGCT AGAGAGGCTA
AACGCCAAAA AATTGTTGAC CGTTATGCTG AAAAACG^C ^TTAAAG GCGGCAGGGG 
actacgaagg CTT a TCT aaa wmcto ACGCCCACC GAC^a CA T AA TC G TO 
GTAGGGTTAC GGGGCGCCCA CATTCAGTTT ACCGCAAATT TGGTCTGAGT CCTATCGCTT 
WOGCGAACT TC CGCA T AAA GGTCAAATTC C^AAC AAAAGCATCT ^GTAAT^A 
AGATATCAAG AGCGTCAAAA CTCCAAGTAA AAATAGGAAA CTTGACGAAG AAACTAAAGT 
TTCPAGGAAA GTTTATCTTT TTCACACAGA G^AGCCCG GGTTCAATTG GGCTTGCCAA 
^CACG AGCTACAGCT TTGGCAAAAA AGACCAATTT GCTTTGGAGC ATTGCTTCTG 
CATTAAATTG TCTATTTTTG CTCGTGCTGT TACGCTCTTT GTATCATGTA TTAACTAGCA 
AGTGCAACTT GCAAACTACT AGTAAGAGGA GAAAAACAAA ATGGTTATGA CTGACCCAAT 
CGCAGACTTC CTAACTCGTA TTCGTAATGC TAACCAAGCT AAACACGAAG TACTTGAAGT 
ACC^CATCA AACATCAAAA AAGGGATTGC TGAAATCCTT AAACGCGAAG GTTTTGTAAA

AAACGTTGAA ATCATTGAAG ATGACAAACA AGGCGTCATC CGTGTATTTC TTAAATACGG 6840

ACCAAATGGT GAGAAAGTTA TCACTAACTT GAAACGTGTT TCTAAACCAG GACTTCGTGT 6900

CTACAAAAAA CGTGAAGACC TTCCAAAAGT TCTTAACGGA CTTGGAATTG CCATCCTTTC 6960

AACTTCTGAA GGTTTGCTTA CTGATAAAGA AGCACGCCAA AAGAATGTTG GTGGTGAGGT 7020

TATCGCTTAC GTTTGGTAAA ATCAAGATAC AAAGCTCGTA AAGAACAAAG CAAAATTAGG 7080

AAGTTGGAGA AGTTTGTTTA CAAACAAGCC AACTTATCTA TTTTGCACAG TTCTTAGAGC 7140

GTGTTCAGTT CAGCTCTTGA ACTAAATAAG TATCTGAACC CCGTGAAAAC TGGCCGTTCT 7200

GGCCTGACAA TTTAACAGGA GAAAATAAAC ATGTCACGTA TTGGTAATAA AGTTATCGTG 7260

TTGCCTGCTG GTGTTGAACT CGCTAACAAT GACAACGTTG TAACTGTAAA AGGATCTAAA 7320

GGAGAACTTA CTCGTGAGTT CTCAAAAGAT ATTGAAATCC GTGTGGAAGG TACTGAAATA 7380

ACTCTTCACC GTCCAAACGA TTCAAAAGAA ATGAAAACTA TCCACGGAAC TACTCGTGCC 7440

CTTTTGAACA ACATGGTTGT TGGTGTATCA GAAGGATTCA AGAAAGAACT TGAAATGCGT 7500

GGGGTTGGTT ACCGTGCACA GCTTCAAGGA TCTAAACTTG TTTTGGCTGT TGGTAAATCT 7560

CATCCAGACG AAGTTGAAGC TCCAGAAGGA ATTACTTTTG AACTTCCAAA CCCAACAACA 7620

ATCGTTGTTA GCGGAATTTC AAAAGAAGTA GTTGGTCAAA CAGCTGCTTA CGTACGTAGC 7680

CTTCGTTCAC CAGAACCATA TAAAGGTAAA GGTATCCGTT ACGTTGGTGA ATTCGTTCGC 7740

CGTAAAGAAG GTAAAACAGG TAAATAATGT TGAGTGGTTG ATCATCAACC ACCAACCTAT 7800

TTTCCAACTT TGTGCATAGC ACACGATTTA AAACTAAAGA GGTGAAAACT GTGATTTCAA 7860

AACCAGATAA AAACAAACTC CGCCAAAAAC GCCACCGTCG CGTTCGCGGA AAACTCTCTG 7920

GAACTGCTGA TCGCCCACGT TTGAACGTAT TCCGTTCTAA TACAGGCATC TACGCTCAAG 7980

TGATTGATGA CGTAGCGGGT GTAACGCTCG CAAGTGCTTC AACTCTTGAT AAAGAAGTTT 8040

CAAAAGGAAC TAAAACTGAA CAAGCCGTTG CTGTCGGTAA ACTCGTTGCA GAACGTGCAA 8100

ACGCTAAAGG TATTTCAGAA GTGGTGTTCG ACCGCGGTGG ATATCTATAT CACGGACGTG 8160

TGAAAGCTTT GGCTGATGCA GCTCGTGAAA ACGGATTGAA ATTCTAATAG GAGGACACTA 8220

GAAAATGGCA TTTAAAGACA ATGCAGTTGA ATTAGAAGAA CGCGTAGTTG CTGTCAACCG 8280

TGTTACAAAA GTTGTTAAAG GTGGACGTCG TCTTCGTTTC GCAGCTCTTG TTGTTGTTGG 8340

TGACCACAAT GGTCGCGTAG GATTTGGTAC TGGTAAAGCT CAAGAAGTTC CAGAAGCAAT 8400

CCGTAAAGCA GTAGATGATG CTAAGAAAAA CTTGATCGAA GTTCCTATGG TTGGAACAAC 8460

AATCCCACAC GAAGTTCTTT CAGAATTCGG TGGAGCTAAA GTATTGTTGA AACCTGCTGT 8520

AGAAGGTTCT GGAGTTGCCG CTGGTGGTGC AGTTCGTGCC GTTGTGGAAT TGGCAGGTGT 8580

GGCAGATATT ACATCTAAAT CACTTGGTTC TAACACTCCA ATCAACATTG TTCGTGCAAC 8640 

TGTTGAAGGT TTGAAACAAT TGAAACGCGC TGAAGAAATT GCTGCCCTTC GTGGTATTTC 8700 

AGTTTCTGAT TTGGCATAAG AAAGGGGATA AAATGGCTCA AATTAAAATT ACTTTGACTA 8760 

AGTCTCCAAT CGGACGCATT CCATCACAAC GTAAAACTGT TGTAGCACTT GGACTTGGCA 8820 

AATTGAACAG CTCTGTTATT AAAGAAGATA ACGCTGCTAT CCGTGGTATG ATCACAGCAG 8880 

TATCTCACTT AGTAACAGTT GAAGAAGTAA ACTAATGAaG TTTTAGGGGA TGTGCACTGT 8940 

ACCATCCCCT AAAACTAGAT ATAGTCATCT ATGATGACAT CGTATAGGCG AGTTGATGGG 9000 

GGAGACAACC TTTTCTCCCT TATCGGCGCT AGCATTTTAC AAAAGAGGAG AAAATAAAAA 9060 

TGAAACTTCA TGAATTGAAA CCTGCAGAAG GTTCTCGTAA AGTACGTAAC CGCGTTGGTC 9120 

GTGGTACTTC ATCAGGTAAC GGTAAAACAT CTGGTCGTGG TCAAAAAGGT CAAAAAGCTC 9180 

GTAGCGGTGG CGGAGTTCGC CTTGGTTTTG AAGGTGGACA AACTCCATTG TTCCGTCGTC 9240 

TTCCAAAACG TGGATTCACT AACATCAACG CTAAAGAATA CGCAATTGTG AACCTTGACC 9300 

AATTGAACGT CTTTGAAGAT GGTGCTGAAG TAACTCCAGT TGTTCTTATC GAAGCAGGAA 9360 

TTGTTAAAGC TGAAAAGTCA GGTATTAAAA TTCTTGGTAA CGGTGAGTTG ACTAAGAAAT 9420 

TGACTGTGAA AGCAGCTAAA TTCTCTAAAT CAGCTGAAGA AGCTATCACT GCTAAAGGTG 9480 

GTTCAGTAGA AGTCATCTAA GAGAGGTGAC CTATGTTTTT TAAATTATTA AGAGAAGCTC 9540 

TTAAAGTCAA GCAGGTTCGA TCAAAAATTT TATTTACAAT TTTTATCGTT TTGGTCTTTC 9600

GTATCGGAAC TAGCATTACA GTTCCTGGTG TGAATGCCAA TAGCTTGAAT GCTTTAAGTG 9660

GATTATCCTT CTTAAACATG TTGAGCTTGG TGTCGGGGAA TGCCCTAAAA AACTTTTCGA 9720 

TTTTTGCCCT AGGAGTTAGT CCCTATATCA CCGCTTCTAT TGTTGTCCAA CTCTTGCAAA 9780 

TGGATATTTT ACCCAAGTTT GTAGAGTGGG GTAAACAAGG GGAAGTAGGT CGAAGAAAAT 9840 

TGAATCAAGC TACTCGTTAT ATTGCTCTAG TTCTCGCTTT TGTGCAATCT ATCGGGATTA 9900 

CAGCTGGTTT TAATACCTTG GCTGGAGCTC AATTGATTAA AACTGCTTTA ACTCCACAAG 9960 

TTTTTCTGAC GATTGGTATC ATCTTAACAG CTGGTAGTAT GATTGTCACT TGGTTGGGTG 10020 

AGCAAATTAC AGATAAGGGA TACGGAAACG GTGTTTCCAT GATTATCTTT GCCGGGATTG 10080 

TTTCCTCAAT TCCAGAGATG ATTCAGGGCA TCTATGTGGA CTACTTTGTG AACGTCCCAA 10140 

GTAGCCGTAT CACTTCATCT ATCATTTTCG TAATCATTTT GATTATTACT GTATTGTTGA 10200 

TTATTTACTT TACAACTTAT GTTCAACAAG CAGAATACAA AATTCCAATC CAATATACTA 10260 

AGGTTGCACA AGGTGCTCCA TCTAGCTCTT ACCTTCCGTT AAAAGTAAAC CCTGCTGGAG 10320

TTATCCCTGT TATCTTTGCC AGTTCGATTA CTGCAGCcTG CGGCTATTCT TCAGTTTTTG 10380

AGTGCCACAG GTCATGATTG GGCTTGGGTA AGGGTAGCAC AAGAGATGTT GGCAACTACT 10440

TCTCCAACTG GTATTGCCAT GTATGCTTTG TTGATTATTC TCTTTACATT CTTCTATACG 10500

TTTGTACAGA TTAATCCTGA AAAAGCAGCA GAGAkCCTAC AAAAGAGTGG TGCCTATATC 10560

CATGGAGTTC GTCCTGGTAA AGGTACAGAA GAATATATGT CTAAACTTCT TCGTCGTCTT 10620

GCAACTGTTG GTTCCCTCTT CCTTGGTGTG ATTTCCATTT TACCGATTGC AGCTAAAGAT 10680

GTATTTGGTC TTTCTGATGT TGTTGCCTTT GGTGGAACAA GTCTCTTGAT CATTATCTCT 10740

ACAGGTATCG AAGGAATCAA GCAATTGGAA GGTTACCTAT TGAAACGTAA GTATGTTGGT 10800

TTCATGGACA GAACAGAATA AAAGTATTTA CTGAATCAGT AAATACTGAG GGAGTGGAGG 10860

TTTAAACTCT GACATTTGTA AGAGTTGGAT CTCCCCTCTT CTATTTTGTT TTTAAATCGG 10920

GGTGAAAAGA CTTTTTGCTT CTATTTAAAA ATAAAATAAG GAGATCAAAT CATGAATCTT 10980

TTGATTATGG GCTTACCTGG TGCAGGTAAG GGAACTCAAG CAGCAAAAAT CGTAGAACAA 11040

TTCCATGTTG CACATATCTC AACAGGTGAT ATGTTCCGCG CTGCAATGGC AAATCAAACT 11100

GAAATGGGTG TTCTTGCTAA GTCATATATT GACAAGGGTG AATTGGTTCC TGACGAAGTT 11160

ACAAATGGAA TCGTAAAAGA ACGCCTTTCA CAAGATGATA TTAAAGAAAC AGGATTCTTA 11220

TTGGATGGTT ACCCACGTAC AATTGAACAA GCTCATGCCT TGGACAAAAC ATTGGCTGAA 11280

CTTGGCATTG AACTAGAAGG TGTTATCAAT ATTGAAGTGA ACCCTGACAG CCTTTTGGAA 11340

CGTTTGAGTG GGCGTATCAT CCACCGCGTA ACTGGAGAAA CTTTCCACAA GGTCTTTAAC 11400

CCACCAGTTG ACTATAAAGA AGAAGATTAC TACCAACGTG AAGATGATAA GCCTGAGACA 11460

GTAAAACGTC GTTTGGATGT TAATATTGCT CAAGGAGAAC CAATCATTGC TCACTACCGT 11520

GCCAAAGGTT TGGTTCATGA CATCGAAGGT AATCAAGATA TCAATGATGT CTTCTCAGAT 11580

ATTGAAAAAG TATTGACAAA TTTGAAATAA AGCGTTTTTC ACACTTGCAA AAATCCGCTA 11640

CAAATGTTAT ACTGAGATAG TCTGACTTAT AATTGTTGTC TCTGTGTCTA GAGGCATCGA 11700

ATCGAAATTT ATGGAGGTGC TTTTGCGTGG CAAAAGACGA TGTGATTGAA GTTGAAGGCA 11760

AAGTAGTTGA TACAATGCCG AATGCAATGT TTACGGTTGA ACTTGAAAAT GGACATCAGA 11820

TTTTAGCAGG G 11831


(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10726 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

CCCGGCATTT GAAAGCTATT CGTGAAGGAT TTATGATGGC AATGCCTTTG ATTTTAGTCG 60

GCTCTTTATT TCTTATTCTA ATCAGTTGGC CTCAAGAGGC TTTTACAAAT TGGCTGAATA 120

GTGTTGGATT GCTAAGTATC TTGACAACTA TGAATCAGTC AACAGTAGCG ATTATCTCCT 180

TGGTCGCTTG TTTCGGTATT GCCTACAGGT TGTCGGAAGG ATATGGTACA GATGGTCCGT 240

CGGCAGGGAT CATAGCCTTA TCCAGTTTTG TATTGATGGC ACCTCGTTTT TCGAGTATGG 300

TTTATGATAA AAATGGGGAG CAGGTCAAGC AGTTATTTGG CGGCGCAATA CCATTTTCTA 360

GCCTGAATGC ATCTTCTTTG TTTATGGCGA TTACTATTGG ATTGGTTACA GCAGAGATTT 420

ATCGTATGTT TATCCAGCGC GGAATTACGA TAAAAATGCC AAGTGGTGTC CCAGATGTAG 480

TAAGTAAATC ATTTTCAGCT CTTTTATCTG GTTTTACTAC TTTTGTTTTG TGGGCTTTGG 540

TCTTAAAAGG TCTTGAAGCG GCAGGAGTTG CAGGAGGTCT CAACGGACTC CTAGGTGCAA 600

TTGTTGGAAC ACCGCTTAAG TTAATTGCAG GAACGCTTCC AGGTATGATT CTATGTGTTA 660

TTGTAAACTC ATTCTTTTGG TTCTGTGGAG TTAATGGGGG ACAAGTTTTA AATGCTTTTG 720

TAGACCCAGT TTGGTTACAA TTTACTACAG AAAACCAAGA AGCTGTGGCT GCAGGACAAA 780

CACTCCAACA CATTATTACA TTACCGTTTA AAGATTTATT TGTATTTATT GGTGGCGGTG 840

GAGCGACTAT TGGTCTTGCG ATTTGTCTCT TCCTATTTAG TAAGAGTCGT GCGAATAAAA 900

CATTAGGTAA GCTAGCTATT ATACCGTCTA TTTTTAATAT CAATACAGCT ATTCTATTTA 960

CGTTTCCAAC AGTTTTAAAT CCGATTATGC TGATTCCGTT TATTGCTACT CCTACAATCA 1020

ATGCCTTGAT TACCTATGTA TCAATGGCTG TAGGATTAGT ACCCTATACA ACAGGTGTAA 1080

TCCTTCCGTG GACAATGCCA CCGATTATAG GAGGCTTCCT TGCAACAGGG GCTAGTTGGC 1140

GAGGAGCTCT ATTACAAGTT GTTTTGATTT TGGTTTCTGT AGCAATTTAT TATCCATTCT 1200

TCAAAATTGC AGATAAACGC AATCTTGAAA AAGAAAAAGC TACTGTTGGA GGGAAATAAG 1260

ATGGTTATCA GAGTATTTGA TCAACAGAAA AATACTTATT CTAGCTTTGC CTTAGAGGAA 1320

TTAAGTTACT ATATGAATCG GGTCTTTAAG ACTAACATAG AGCTTGTCGA GGAGAAGGAA 1380

GCGGATATTT TTGTAGGATT AGTCAATAAA GAGGACAGAA AAGACCATGT TCTTATCTCA 1440

TTAGACAAGG GTAAGGGGAG AATTGAGTCT AATACAATTG TAGGTTTACT TATTGGAATT 1500

TACCGAATGT TTCATGAATT TGGGGTTGTG TATACTAGAC CAGGGCGCAG ACATGACTTT 1560

GTTCCAGAGT TACGATTTGA AGATTTTTTA GATAAACAGC TATCTATAGA TGAAACAGCC 1620

AGTTACTATC ATAGGGGAGT ATGTATAGAG GGAGCGGATT CATTTGAAAA TATACTAGAT 1680

TTCATTGATT GGCTACCTAA GATTGGGATG AACAGTTTTT TCATCCAGTT TGAAAATCCT 1740 

TACTCTTTTT TGAAACGTTG GTATGAACAT GAATTTAATC CATATCTAAA TAAAGAACAA 1800 

TTTTCAAATG AATTAGTACA AGAATTGAGT GATAGGTTGG ATAAAGAATT GCAAAAAAGA 1860 

GGTCTTATTC ATCATCGTGT TGGTCATGGA TGGACAGGTG AAGTTTTAGG TTACTCTTCA 1920 

AAATTTGGCT GGGAATCAGG TCTTAGTATT TCAGAGGAGA AGAAACCCTA TGTCGCTGAA 1980 

ATAAACGGGA AACGAGAATT GTTTAATACG GCTCCGATTT TAACCAGCCT GGATTTTTCA 2040 

AATCCAGATG TAGCTGATAA GATGGTAGAA ATTATCAAGG ATTATGCCAA GAAAAGACCT 2100 

GATGTTAACT ACTTACATGT ATGGTTGTCG GATGCTCGTA ATAATATTTG TGAATGCGAA 2160 

AACTGTAGAC AAGAATTGGT TTCGGATCAG TATATTCGTA TTCTCAATCA ATTGGATAGG 2220 

GCTTTAACGA GTGAGGGATT AGATACAAAG ATTTGTTTTC TGCTTTATCA TGAGTTGTTA 2280 

TGGGCACCTC AGAAAGAAAA ATTAGATAAT CCTGAACGCT TTACCATGAT GTTTGCACCG 2340 

ATTACAAGAA CATTTGAAAT GAGTTATGCA GATGTAGATT TTGACAATTC CATACCTACG 2400 

CCTAAACCTT ATATGCGTAA TAAAATTATA CTTCCGAATT CTCTTGAGGA AAATTTATCT 2460 

TATCTTTTTG AGTGGCAAAA AGCATTTAAA GGAGATAGTT TCGTATATGA CTATCCTTTA 2520 

GGGCGTGCTC ATTATGGCGA TTTAGGCTAT ATGAAAATTA GTCAAACTAT TTACAGAGAT 2580 

GTATCTTATC TTTCCAACCT ACATTTGAAC GGGTACATTT CGTGTCAAGA ATTACGTGCC 2640 

GGATTCCCTC ATAATTTTCC TAATTATGTC ATGGGGGAAA TGCTCTGGAA GAAGACAAGA 2700 

AGTTATGAAG AATTGATTGA AGAATACTTT TCTGCTTTGT ATGGGGAAAA TTGGCAGTCT 2760

GTTGTTGAAT ATTTAGAAAA ATTATCCATT TATTCCTCTT GTGATTATTT TAATGCAATT 2820 

GGCAGCCGTC AAAGTGATGT TTTAGCGAAT CATTATTATA TAGCTTACAA TCTAGCTGAT 2880 

AATTTTTTAC CAATTATTGA GGAAAATATT TCTAAGTTAT TAAATAGTCA AAAGGATGAA 2940 

TGGAAACAGC TCAGTTATCA TCGTGAATAT GTTGTTAAGA TGGCGAAGGC TTTATATCTT 3000 

CAAGCAACTG GAAAAACAAG GCAAGCTCAA GATGAATGGA GAAATGTGTT GAATTATATC 3060 

CGTGGGCACG AATTGCTATT TCAATCTAAT TTGGATGTTT ATCGTGTAAT TGAAGTAGCA 3120 

AAAAATTACG CTGGTTTCCA CTTATAAATC ATAAGTATAG AAAATGAACT AAGGTATTCA 3180 

GAGAAGATTG ATCCTAAATA TTATGAAATT TAAGGATTTT TAAGATATTT AGGGTCAACT 3240 

TTCTATTTAT ATCGTAGCGA AGTCATTTTA ATAATGATGT GTAAAAGATG GATCAAGATT 3300 

GAGGAGGAAG AAAGATGAAA TCAAAAGAAG AAATAAATAT GCTTGGTTTT ACAATTGTCG 3360

CTTACGCAGG AGATGCAAGG TCAGATTTGA TGGATGCTTT GGCGTTTGCG AGAGATGGAT 3420

ATTTTGAACA GGCAAGAGAA TTGGTTGAGT CTGCAAACGA CTCAATAGTG TCTGCCCATC 3480

GAGAACAGAC TAATTTATTA GCGGAGGAGG CATATGGAGA TAATTTTGAA GTGAGCTTTA 3540

TTATGATTCA TGGTCAAGAT ACTTTGATGA CAACGATGCT ATTGTATGAT CAGGTAAAGT 3600

TTTTTATTGA TGAATATGAA CGAATTCGAA AGATTGAAGA ACATATTGGT TTGCAATGAG 3660

GATTAGTCAT GGAAAATTTA CAGGTTAAAG CCTTACCGAA GGAGTTTTTA TTAGGAACTG 3720

CTACCGCTGC TTATCAAGTA GAGGGTGCAA CTAGGGTAGA TGGCAAAGGA ATAAATATGT 3780

GGGATGTTTA TTTGCAAGAA AATAGTCCGT TCTTACCAGA TCCAGCTAGT GATTTTTATT 3840

ATCGTTACGA AGAGGATATA GCTTTGGCGG CAGAACATGG TTTGCAGGCT TTGCGTTTAT 3900

CTATTTCTTG GGTTCGTATA TTTCCTGATA TAGATGGGGA TGCTAATGTA TTAGCTGTTC 3960

ATTATTACCA 


TAGAGTTTTT CAGTCTTGCT 


TAAAACATAA 


TGTGATTCCG 


TTTGTTTCTT 


4020 


TACATCATTT 


TGATTCGCCT CAGAAAATGT 


TAGAAACAGG 


GGATTGGTTG 


AACAGAGAGA 


4080 


ATATTGATCG 


TTTCATACGA TATGCTCGCT 


TTTGTTTCCA 


AGAATTTACA 


GAAGTCAAGC 


4140 


ATTGGTTTAC 


AATCAATGAA CTGATGTCTC 


TTGCTGCAGG 


TCAATATATA 


GGAGGTCAGT 


4200 


TTCCTCCAAA 


TCATCATTTT CAATTATCTG 


AAGCAATTCA 


AGCGAATCAT 


AATATGTTGT 


4260 


TGGCGCATGC 


TCTTGCAGTC CTCGAATTTC 


ATCAATTAGG 


GATTGAGGGA 


AAGGTAGGTT 


4320 


GTATTCATGC 


TTTAAAGCCA GGCTATCCTA 


TTGATGGGCA 


AAAAGAAAAT 


ATTTTGGCAG 


4380 


CTAAACGGTA 


TGATGTTTAT AATAATAAAT 


TTCTATTAGA 


TGGAACTTTT 


TTGGGCTACT 


4440 


ACAGTGAGGA 


CACGCTTTTT CACTTGAATC 


AAATATTGGA 


AGCTAATAAT 


TCTAGCTTTA 


4500 


TTATTGAAGA 


TGGTGATTTA GAAATTATGA 


AGAGAGCTGC 


ACCTCTTAAT 


ACGATGTTTG 


4560 


GGATGAATTA TTATCGTTCA GAATTTATTC GTGAATACAA AGGTGAAAAT 


AGACAAGAAT 


4620 


TTAATTCAAC 


AGGAATAAAA GGACAGTCTT 


CTTTTAAATT 


AAATGCTCTA 


GGTGAATTTG 


4680 


TAAAAAAACC 


TGGTATTCCG ACAACAGATT 


GGGATTGGAA 


TATTTATCCT 


CAAGGGTTAT 


4740 


TTGATATGTT 


GCTTCGTATC AAAGAAGAAT 


ATCCTCAACA 


TCCGGTCATT 


TATTTAACTG 


4800 


AAAATGGTAC AGCCCTTAAA GAAGTTAAGC CAGAGGGCGA 


GAATGATATT 


ATTGATGACA 


4860 


GTAAGAGAAT 


CCGTTATATT GAGCAACATT 


TACACAAAGT 


TTTAGAGGCT 


CGAGATAGAG 


4920 


GAGTCAATAT 


TCAAGGCTAT TTTATATGGT 


CTTTGCAAGA 


TCAATTTTCT 


TGGGCGAATG 


4980 


GCTACAATAA 


GCGATATGGT CTTTTCTTTG 


TTGATTATGA 


AACACAGAAG 


AGATATATTA 


5040 


AGAAAAGTGC 


TCTTTGGGTA AAAGGGCTAA 


AACGGAATTA 


AGGTTAGCGA 


TTTGACTGAT 


5100 


GTTTAATATG 


TTTTAAATAT GAGGTTGAAT 


TTTTTATAGG 


AGGAGTTTTA 


TGGATAAGCT 


5160 
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AGTCGCTGCC 


ATTGAAAAGC 


AACAAGGGAA 


ATTTGAAAAA 


ATTTCTACTA 


ATAACTATAT 


5220 


GATGGCTATT 


AAAGATGGAT 


TCATTGCTAC 


TATGCCTTTA 


ATTATGTTTT 


CAAGCTTTTT 


5280 


GATGATTATT 


ATTATGATTC 


CTAAAAATTT 


CGGAGTAGAG 


TTACCGAGTC 


CAGCTATTGT 


5340 


CTGGATGAGA 


AAAGTGTATA 


TGTTAACCAT 


GGGAGTTTTG 


GGTATTATTG 


TTTCAGGGAC 


5400 


TGTTGGAAAG 


TCATTAGTTG 


GAAATGTTAA 


CAGAAAAATG 


CCTCACGGAA 


AGGTAATAAA 


5460 


TGATATTTCT 


GCAATGTTGG 


CAGCCATATG 


TAGTTATCTG 


GTATTAACTG 


TAACGCTTGT 


5520 


AGTTGATGAG 


AAGACGGGAT 


CTACAAGTTT 


GTCGACAAAC 


TATTTAGGAT 


CTCAAGGATT 


5580 


GATAACTTCG 


TTTGTCAGTG 


CCTTTATTAC 


TGTAAATGTT 


TACCGATTCT 


GTATTAAGCG 


5640 


AGACATTACT 


ATTCATTTAC 


CTAAGGAAGT 


TCCTGGGGCT 


ATATCACAAG 


CTTTTAGAGA 


5700 


TATTTTCCCT 


TTTTCTTTTG 


TTTTACTTAT 


TAGTGGTTTG 


TTAGATATTG 


TATCTCGGTT 


5760 


TAGTTTAGAT 


GTTCCTTTTG 


CCCAAGTATT 


TCAACAACTA 


TTGACTCCTA 


TTTTTAAGGG 


5820 


GGCAGAATCA 


TATCCTGCTA 


TGATGTTGAT 


TTGGTTTATG 


TGTGCTTTGC 


TTTGGTTTGT 


5880 


TGGAATTCAT 


GGACCATCTA TTGTCTTACC 


TGCTGTTACA 


GCTTTGCAAC 


TGAGCAATAT 


5940 


GGAAGAGAAT 


GCTCAACTTC 


TTGCAAATGG GCAGTTCCCT 


TATCATTCTT 


TAACACCTAA 


6000 


TTTCGGGAAT 


TATATCGCTG 


CTATTGGAGG 


AACGGGGGCT 


ACCTTTGTTG 


TACCATTTAT 


6060 


TTTGATTTTC 


TTTATGCGGT 


CTAAACAATT 


AAAATCGGTA 


GGTAAAGCTA 


CAATTACTCC 


6120 


TGTTTTATTT 


GCGGTAAATG 


AACCTCTTCT 


ATTTGGTATG 


CCTGTTATTT 


TGAATCCCTA 


6180 


TCTTTTTGTC 


CCTTTTTTGA TGACTCCACC AGTGAATGTA 


TTTCTAGGAA 


AGGTCTTTAT 


6240 


TGATTTCTTT 


GGAATGAATG 


GATTTTATAT 


CCAGTTACCT 


TGGACCTTTC 


CTGGTCCCTT 


6300 


GGGATTGTTA 


ATTGGAACGA ATTTTCAACT 


TATCTCCTTT 


GTATTTTTAT 


CTTTGATTTT 


6360 


AGTTGTCGAC 


ATATTGATTT 


ATTTGCCATT 


CTGTAGAGCG 




Ab I 1 At- 1 o(j 1 


6420 


GAAAGAAGAT 


ATTGCAAGCT 


CAAATGATAT 


TATTTTAGAG 


GAGGATACAA 


GTGAAATAAT 


6480 


TCCTGGTGAG 


ATAGATGAAA 


TAAAAAGTAA 


GGAGTTGAAA 


GTACTGGTTC 


TTTGTGCAGG 


6540 


GTCTGGAACA 


AGTGCGCAAT 


TAGCCAATGC 


AATTAACGAG 


GGGGCTAACT 


TAACAGAGGT 


6600 


TAGAGTGATT 


GCGAATTCAG 


GAGCGTACGG 


AGCTCATTAT 


GATATTATGG 


GTGTTTATGA 


6660 


TTTAATTATT 


CTGGCCCCAC 


AAGTTCGGAG 


TTATTATAGA 


GAGATGAAGG 


TGGATGCAGA 


6720 


AAGATTAGGT 


ATTCAGATAG 


TTGCTACCAG 


AGGAATGGAA 


TATATTCATT 


TAACAAAGAG 


6780 


TCCAAGTAAA 


GCCTTACAAT 


TTGTATTGGA 


GCATTACCAA 


GCTGTGTAGT 


AAGTTTTTCC 


6840 


ATCTTTTATT 


TGAGTAAAGA 


TTTTGTTTAC 


AGATAGGCTT 


GGATTTAAAA 


ACGTTCCCCC 


6900 
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TTTTTTAATA 


TAAGAATCCC 


TCTTTCACAA 


TTGTAAAAAG 


AGGGATTTTG 


TATTTTATCT 


6960 


CTTAGACCAA 


GTTCTCTTCA 


TAAAGAGAAG 


GAGGATTGGG 


TAAATCTCCA 


AGCGCCCTGC 


7020 


AATCATTGCA 


AAGGATAGGA GAATTTTTGA 


GATGGGACTA 


AAGATTGAGA 


AACTAGAAGT 


7080 


GGTTCCTAGA 


ATAGGCCCGA TATTATTGAA ACAGCTAAAG 


AC AG CGCTGG 


TCACGACCAG 


7140 


AAAATCATTG 


CTATCTAGGC 


TGACAATAAA 


GATAAGCGCT 


AGCAAAATCA 


TAGCATAGAT 


7200 


GACAAAGTAC 


TTGAGAATCT 


TATGCTGGGT 


ATCTTTGTCA 


ATCACCGTTT 


TATTAACATG 


7260 


GAGGGTCAAA 


ACACGGTGGG 


GCGATAGGAT 


TGACAAAATT 


TGGTTTTTGG 


CAATTTTTGA 


7320 


AAGGATGAGG 


CCTCGAATAA 


TCTTGAGTCC 


ACCTGCAGTT 


GATCCAGCAG 


AGCCACCGAT 


7380 


TGCCATGAGG 


AAAAGGAGGA 


TAAACTGGGA 


GAAGAGGGGC 


CAGTTGGTAA 


TATCTCCATA 


7440 


TCCAAAACCA 


GTTGTTGTAA 


TGATGTTGGA 


AACCTGGAAG 


AAGGTCATTT 


CAAAGCTCTT 


7500 


TGAAAACCCT 


GGGTAGAGGT 


AGAGGGTGTT 


GAGGCTAATC 


AAGCCTGTAG 


AAACCAGTAC 


7560 


AATGACCAAG 


TAAGCCCTAA 


GCTCTTCATC 


TCCAAAGAAG 


GCCTTGATGC 


GACGGAGCAT 


7620 


GAGGTAGTAG 


TAGAGGTTGA 


AATTTACTCC 


AAAAACCAGA 


ACTCCGATAC 


TGACCAGATA 


7680 


GGTAATCAGT 


GAGCTGCCAT 


AGTGGGCAAT 


TCCGTCGTTA 


TAGACGGTAA 


AGCCTCCAGT 


7740 


TCCCGCTGTC 


CCCATAGCAA 


TAACAAAACT 


ATCGTAGAGA 


GGCATACCGG 


CTAGATAATA 


7800 


GATGATGACA 


AAGAGGGAGA 


AGAGAGCTAG 


ATAAAGGAGA 


TAGAGAATCT 


GGGCAGTGTT 


7860 


TTTTAGTTTG 


GATACAACCT 


TGCCAAAAAC 


AGGACCTGGA 


ACCTCAGCCT 


TCATCACCTC 


7920 


TAGGTGGCTA 


TTTTTGGCAT 


TGTCCATAAT 


AGCAAGTGCA 


AAAACAAGCA 


CTCCCATCCC 


7980 


TCCAATCAAG 


TGGGTAAAAC 


TTCGCCAGAA 


GAGGAGGGAA 


CGGCTGAGAA 


CCGAAACGTC 


8040 


GTTCAAAATA 


CTTGCTCCAG 


TAGTTGTAAA 


TCCAGAACTA 


ATTTCAAAAA 


AGGCATCAAT 


8100 


AAGGCTGGGG 


ATTTGCCCAG 


AAAAGACAAA 


GGGGAGACCA 


CCAAAGAAAG 


ACCAAAGGAT 


8160 


CCAACAGAGG 


GCAACGATCA 


AGACTCCCTC 


CTTGGCATAA 


ATCCGTTGAT 


TTTTTGGCTT 


8220 


CTGTAAACTC 


CCTGAACCGC 


CTAACAATAC 


GAGAATCCCT 


ATGGTCGAAA 


AGAGGGCTGT 


8280 


AAAGACTTGG 


CTCGATTCAC 


GGTAATAGAC 


AGCAATCGCA 


ACAGGAACCA 


AAAGAAGAAC 


8340 


AGCTTCAATC 


AAAAGTAATT 


TTGAAAGGAG 


GTAACGAATC 


ATACTTTTAT 


TCATTTCTTA 


8400 


CCTCGCGATC 


AAGTCATAAA 


TCTTGGTGAT 


GTTTGGCAAC 


AAGGTTGTTA 


CTAGGAGCTT 


8460 


GTCTCCAACT 


TCCAACATAT 


CCTCCCCAGT 


TGGGAAAATA 


GTCTTGCCCT 


TTCGAATAAT 


8520 


GGCTGCAATA AGAACCCCTT 


TTTTCAATTT 


CAGTTGAGAA 


AGAGGTTTGG 


CAGTCATTTT 


8580 


ATTGGCTTCC 


TTGATATGGA 


ATTGCAGGGT 


TTCGATTTGG 


CCATTGGCTA 


GATGGTGCAT 


8640 


AGCTTGAAGG 


TCTGAATACT 


GGGCATTAAC 


TCGACCACGA 


ATAAAGTGCA 


TAATCGTATC 


8700 
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TACAGCGATG CTTTTAGGTG TGATGATACT TGAAAAATCA GGCGCATTGA TAATCTCGAG 8760 

GAGACTGGTA CGATTGACCT TAGTAATATT TTTCTGTACA CCTACCCTGT CAAGGAACAT 8820 

AGATGTAATC AGATTTTCCT CATCGACTCC TGTTAGAGTC GCAACGGCAT CATAGTGTTG 8880 

AGCACTTTCT TCCAGCAGGA TATCTTTTGC GGTTCCATCT CCTTGAACGA TGTAGAGATT 8940 

TGGGAATTTC TCGCTAAAGA AGCTGGCGAT TTCAGGATTG ATTTCAATGA CTTTTGTATC 9000 

GATACGACTA TCTTTGAGAA TACCAAGTAG ATAATAGGCA ATTCTACCTG CCCCAACGAT 9060 

GAGAAGGCTC TTCACGGCGC GTGATTTAAA ATAATTATGG AAGAGTATCA TATCGACACG 9120 

GTTACCAGTG ACAAAGATTC TATCTTTATC CTGTACAGTC ATGTCACCGC TTGGAATGAT 9180 

AATTTGATGA TCCCTCTCTA TCGCACAGAC AATGACATTA CCAAATTTTT TACGAAAATC 9240 

AGAAATGGGC ATTTGGCAAA GACCGCTGGT GGACTTGACG ACAAATTCCA TGAGGCTAAC 9300 

GCGTCCACCA GCAAAGCGTT CGACAGACAG GGCGTTGGGG AAGTCAATGA TATTCGCGAT 9360 

AGCGCGGGCA GCCAAGAGCT CAGGATTAAC GATAAGAGAA AAACCGAGAA TATTCTTTTC 9420 

CTTGAAATAA GAGTTAGAAT ATTCAGGGTT CCGCACCCGA ACGATAGTTT CTTTAGCTCC 9480 

CATTTTCTTG GCTAGAACTG CTGCAATCAT GTTGACTTCA TCGTGCTCAG TCAGGGCGAT 9540 

AAAGATATCA CAATCTTGGA CGCTGGCTTG CTCAAGAATG GCAAAATCGG CCCCGTTACC 9600 

AAGGATACCA ATGATATCAA AGCGACTGAC AATATGATTG AGAACAGCTT CGTCTTGCTC 9660 

AATCAGCAAA ACATCATGCT TTTCTGCAAC CAAGGAGCGA CAGAGGGCAA AACCAACTTT 9720 

TCCCCCTCCG ACAAGGATAA TTTTCATAAT AAAACCTACT TTTTCATGAT GTAACTATCA 9780 

TACCCTTTTT CAAGAAAAAA TGCACCTACT AGCTAATAAC AAGAGTTTTT AGTGAAAATT 9840 

CGCTATAAGG TAAAACTATA CCCTAACCAA TTGAAATAGC TATTAGCGAC TTTCTCTGAA 9900 

ATATGGTATG ATAAAGGATA TACAAGGAGA TAAAATGAAT AATAATTTAC TGGTATTACA 9960 

ATCAGACTTT GGTCTGGTTG ATGGTGCGGT ATCGGCTATG ATTGGAGTGG CTTTAGAAGA 10020 

GTCTCCAACC TTAAAAATAC ATCACTTGAC GCACGATATC ACGCCTTATA ATATTTTTGA 10080 

GGGGAGCTAT CGTCTCTTTC AGACGGTGGA TTACTGGCCT GAGGGAACGA CGTTTGTATC 10140 

GGTTGTCGAT CCAGGTGTCG GTTCGAAACG TAAGAGTGTA GTTGCCAAGA CTGCAAAAAA 10200 

TCAATACATT GTCACGCCAG ATAATGGGAC GCTTTCCTTT ATCAAGAAAC ACGTTGGCAT 10260 

TGTAGCCATT CGTGAGATTT CTGAGGTGGC CAATAGGCGT CAAAACACAG AGCATTCTTA 10320 

TACCTTCCAC GGTCGTGATG TCTATGCCTA TACTGGTGCT AAACTGGCCA GTGGTCACAT 10380 

TACTTTTGAG GAAGTAGGGC CAGAGCTCAG TGTGGAACAG ATTGTAGAGC TTCCAGTCGT 10440 
AGCGACCATC ATAGAAGATC ATCTGGTGAA GGGAGCCATT GATATTCTGG ATGTGCGTTT 10500

CGGTTCGCTT TGGACCTCTA TCACACGGGA AGAATTTTAC AAGCTGGAAC CAGAATTTGG 10560 

TGATCGTTTT GAAGTGACCA TCTATCATGC TGATATGCTG GTCTATCAAA ATCAGGTTGT 10620 

CTATGGCAAA TCATTTGCAG ATGTGAGAAT TGGGCAACCs ATcTTTACrc TCAGCaTCTt 10680 

CGATTAGCTG GGCAATTCGT TCTAGTTGGA TTTCGTCAAT CAAGGT 10726 
(2) INFORMATION FOR SEQ ID NO: 67: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7163 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

TTATCTTTAA CGATATCAAT CAAGATCTGG TCAATAAAGG GATTGGGGCT TATCGTGAAG 60 

TTGGCATCCA AGCCCATGGA TATGTCTGTG ACGTGACAGA CGAGGACGGT ATCCAAGCCA 120 

TGGTCAAGCA AATCGAACAA GAGGTTGGTG TCATTGACAT CCTCGTTAAT AACGCTGGTA 180 

TTATCCGCCG AGTTCCAATG TGCGAAATGA GCGCCGCTGA TTTCCGTAAG GTCATCGATA 240 

TTGACTTAAA CGCACCATTT ATCGTTTCAA AGGCAGTTAT TCCTTCTATG ATAAAGAAAG 300 

GGCATGGAAA GATTATCAAT ATTTGTTCGA TGATGAGCGA ACTGGGACGT GAAACAGTTA 360 

GCGCTTATGC TGCTGCTAAA GGGGGCTTGA AAATGTTGAC CCGCAACATT GCGTCTGAAT 420 

ACGGTGGAGC CAATATCCAA TGTAACGGAA TTGGACCGGG TTATATTGCC ACTCCTCAAA 480 

CAGCACCTCT TCGTGAATTG CAAGAAGATG GTTCTCGCCA CCCATTTGAC CAGTTCATCA 540 

TTGCAAAAAC ACCTGCTGCA CGTTGGGGAA ATACTGAAGA TTTGATGGGC CCTGCTGTCT 600 

TTCTCGCTAG TGATGCCAGC AATTTTGTCA ATGGCCACAT CCTATATGTA GATGGCGGTA 660 

TCTTAGCCTA CATCGGAAAA CAACCTGAGT AAAAATAGAA AGAAGATCTT ATGAAAATCG 720 

CATTAATCAA TGAAAATAGT CAAGCTAGCA AGAATCACAT TATTTACGAT AGTCTAAAAG 780 

AAGCGACAGA TAAAAAAGGC TACCAATTAT TTAACTATGG TATGCGTGGA GAAGAAGGAG 840 

AAAGTCAATT AACTTATGTG CAGAACGGAC TAATGGCTGC CATCCTTTTA AATACAAAGG 900 

CAGTTGACTT TGTTGTTACC GGCTGTGGTA CGGGTGTAGG GGCTATGCTT GCTTTAAACA 960 

GCTTCCCTGG TGTTGTCTGT GGTCTAGCAG TGGACCCAAC TGACGCTTAC CTTTATTCTC 1020 

AAATCAATGG TGGTAACGCC TTGTCTATCC CTTATGCCAA AGGATTTGGC TGGGGGGCAG 1080 

AACTGACCCT CAAATTGATG TTTGAACGCT TATTTGCTGA AGAAATGGGC GGTGGCTACC 1140 
CAAGAGAACG TGTAATCCCT GAACAACGCA ACGCTCGTAT CTTAAACGAG GTGAAACAAA 1200

TCACCCACAA TGATTTGATG ACCATCCTTA AAATAATCGA CCAAGACTTC CTCAAAGACA 1260 

CCATCTCTGG CAAATACTTC CAAGAATACT TCTTTGAAAA CTGCCAAGAT GATGAAGTTG 1320 

CTGCTTATTT GAAAGAAGTA TTAGCCAAGT AAAGCTATTC TAAACCAGAA AGGAACTAAT 1380 

GGATGACGAA AATATTACTG TTTGGCGAAC CATTAATTCG AATTTCACCA TTAGATGCCA 1440 

CCAGTATCGG CGATCATGTT GCCAGTTCGA CTTATTTTGG CGGATCAGAA ATTAACATCG 1500 

CTTGTAATTT GCAAGCCCTG GGTATCTCAA CGAAAGTTTT TACCGCACTC CCTGCCAACG 1560 

AGATTGGAGA TCGTTTTCTC ACATTCTTGA AACAGCACCA AATCGATACC AGTTCAATCT 1620 

GTCGGCTTGG CGATCGAATC GGCCTCTACT ATTTGGAGAA CGGCTTTGGT TGTCGTCAAA 1680 

GTGAAGTTTT CTACGATCGT AAGCATACGA GTATCAGCCA GATTCGGCCA AACATGCTAG 1740 

ATATGGATTC TCTCTTTCAG GGGATTAGCC ATTTTCATTT TAGTGGAATC ACCGTAGCTA 1800 

TCGGTCAAGA GGTCCGTGCG ATCCTTCTCC TACTCTTGGA AGAAGCCAAG CGCCGAGGAA 1860

TTGTCGTTTC AATGGATCTC AATCTGAGAA CAAAGATGAT TTCAGTCCTA GAAGCCAAGT 1920 

ATGAATTTTC TAAGTTTGCA CGTTTTACTG ACTATTGCTT CGGTATTGAT CCTCTCATGA 1980 

TTGATGACCA AAATCTAGAG ATGTTTCCAA GAGACAGTGC TAGCCTAGAA GAGGTGGAAA 2040 

ATCGCATGCG ACTTTTAAAA GAAGCCTATG GTTTCAAGGC CATTTTCCAT ACCCTCCGCT 2100 

CTAGTGATGA GCAAGACAAA AATGTCTATC AAGCCTATGC TCTAGAAGAA CTATTTGAAG 2160 

AGTCTGTCCA ACTAAAAACT GCAGTCTATC AACGAATTGG TAGCGGGGAT GCCTTTATAT 2220 

CTGGTGCCCT TTACCAACTA CTCCATCATT CCTCCCTAAA AACTACCATT GACTTTGCAG 2280 

TTGCGAGCGC AACTCTCAAA TGCACTCTTC CAGGAGACCA TCTCTCCACT TCCTCAACTA 2340 

GTATTGAAAA TTTACTGGCA AATGCACAAG ATATCATTCG TTAGGAGAAT TACATGACCA 2400 

AATCAGATAC GATTATTGAA CTAAAAAAAC AAAAAATTGT CGCTGTTATT CGAGGAAATA 2460 

CAAAGGAAGA AGGACTACAA GCCTCGATTG CTTGTATCAA GGGCGGTATC AAAGCTATTG 2520 

AAATCGCCTA TACCAATCAG TATGCAGGAC AAATCATCAA GGAACTTGTA GACTTGTATC 2580 

AGGACGATCA GAGTGTTTGT ATCGGTGCAG GTACTGTGCT TGATGCCGTA ACTGCTAGAG 2640 

ATGCCATTCT AGCTGGAGCA AATTACGTTG TTTCTCCATC TTTCCATGCT GAAACTGCGA 2700 

AAATGTGCAA TCTCTACAGC ACACCGTACA TTCCAGGCTG TATTACCCTC ACAGAGATCA 2760 

CGACTGCACT TGAAGCCGGT AGTGAAATCA TCAAACTCTT CCCAGGTAGT ACTCTCAGTC 2820 

CAGCATATAT CTCTGCAGTC AAGGCACCGA TCCCACAAGT TTCCGTAATG GTAACCGGAG 2880 
GAGTCGGCCT AAACAACATC CCTCAATGGT TCGCTGCTGG TGCAGATGCC GTTGGAATTG 2940

GTGGCGAACT CAATAAACTC GCTTCCCAAG GCAACTTTGA CCGCATCAGC GAGATTGCCC 3000 

AACAGTATAT TACACTCAGA TAAAATCATA ACTACCCGTC TAACGGGTGG TTTATCTCAG 3060 

AGCTATAAGC CCAAATCATC AGCCAGCGCC TAAAGACGCT GGCTTTCACG TTGTTCAAGC 3120 

CTTATTGCTC TTGACTCGTC ACTTGCCTCT TTAAGAGACT TTGGTATTAC TTACCACTAT 3180 

CCCTAAAGGG ATCCTCATAT TCTTTTACAC TCAATTTATC TAGTGCTATA GTAGATTGAA 3240 

ACTGGAATAG TACACCTCTG CTTCTAAAAC ATTGTTAAAA ATCGATTTGA CTGTCCTGAT 3300 

CGATTTTGTC CTGTTCTTAT TTCATTTTAC TATATATCAT ACTTTACTCG TTCTCAAATT 3360 

TTCATACTCA TGAAGAAATC ATCCACTCGA TAATTTCTTT AATCTTGACT ATATTTCTTA 3420 

ATTGTGGCTT CATTAAGCCC TACTGGACTT ACATAATAAC CTTCCTCCCA GAAATGCCGA 3480 

TTCCCAAACT TGTACTTGAG ATTGGCGTGT TTGTCAAACA TCATGAGTGC ACTTTTGCCT 3540 

TTTAAATACC CCATAAAACT TGAAACACTT AGCCTCGACG GAATACTGAC TAACATGTGT 3600 

ACATGGTCTG GCATTAAGTG ACCCTCGATC ATTTCAACAC CTTTATAACT ACACAAGCGA 3660 

TGAAATATTT CGTCTAAACT ACTTCTATAT TGATTATAGA TGACTTTTCG TCTATACTTA 3720 

GGGGTGAACA CAATATGATA GAACACCTCC ACTTTGTGTA TGATAAACTA TGAGTCTTTT 3780 

GTGCCATATT TTTTCTCCTT TCGCTTTACA ATTGGATTGA ACACCTTTAT TGTATCGCGT 3840 

TTGGAGTTTT TTTGGTATAA CCTTCGACGC GCACCCGTAT AGCGGGTGGT TGTTTTGTCT 3900 

CGCACCTCAC GGAGCGAGAC GGACTAATAT AGTGGAGTGA AATAGGATAC GAACAAATTG 3960 

ATTAGGAAAA TCAAATGAAT TTATAGAAAT CTTTTAGCAG TTATAACGTT CTATTCTAGT 4020 

TTCAAAACGC TATAGTCACA TAATAATGAA GTAAAAAAGG ATAAGTATCA ACTTATCCTT 4080 

TTTTAAAAGA AAAATCCGAA GATATTTGGC CTTCTTCGGA TTTTTTCTAT TTTCCACAGT 4140 

TTCATGTAAT TCATCTAGAT GATGAACAAA TTAGTTGTTC TTTCCTCTAC GGAATAGATA 4200 

AAATGCCCCA AGTAGCAAGA ACCCTAGACT TGCCAAGATT GACTGACCTT CTCCTGTCTG 4260 

AGGGAGATTC TTTTGATCCG AATGGTTCTT TTCCTCTTCA GATTTTTCCT TTTCTTTTGA 4320 

ATTCTGTACT TGTGGCTGAG CTGCTTGCTC TAGCTTTTTA AAGACTTCCT GATCTGGAGC 4380 

TGATTCCTGG GTTTCAGGAT TATAGTAGGC AATCTTATAT TCATCCCCTT CTTTTCGAAT 4440 

GGTATAGACT CCACGTTTCA AAACTTGGAA TTGGTTGGAA ATAGTAGAGA CAGAATCATC 4500 

ATATTTCACA ATGCCCCAAA CTCCTTGTTT AGCATCATAA ACAGACTGAA GGGTTTCGTT 4560

ATTTTCGATG AGGCTACTTT CTAACTCTTT TATCATTTGA TTGAAGGTGG CACGATCCAC 4620 

GTTAGGAATG AGCATATAGC CATAAGAATC TCTATTTTGC TTATGAGCCT GACTAATCGT 4680 
AAGAAATTCA TTTTCAACTT CCTTGTCTGA CTGTCCTTCA TTGATATCCT TCCAGGCTCC 4740

CTTTTGCAAA GCCTTACTCA TACTGATTGA ACTCTTCTTA AAGAAAAAGT AACCAATATT 4800 

CTTTTTCGAA TCGAACGATT CTAAAAAGAC ACTTTGGGTT TCAGGATAAT CCTTTTCTTG 4860 

TTCTGTAAGG GAGGCTTCTT TATCATTGAC ATAGACTTTA TATGGATTAC CTGATTCCAG 4920 

TTTTCTCTGG TCAATTGTAG TTGCAGCAGT ATCTGTTGAA GTGTTTTGGA TATTGCTTCC 4980 

TAAAAAGGCG ATCTTATCCT TTAGCATAAA CCAGCTCTTA TGAGCAGTCA ATGTTTGATT 5040 

CCAGTTGGTG AAATCCATGG TTGCTGTCGC ATTGGCATCA TCTAGTTTGC TCGTTCCAAC 5100 

GAAAGCAGAC GGTAAAACTT TACCTGTATC GCTATCCGCT CTCTTAGCAT CCGTCTCTGT 5160 

TGTACCAGGC ATCTTATATG GATTAACTGT TGGCCAGTAG CCATCGCTAT AGTGACTCAA 5220 

ATCGCCATTG TAAAGATAGA ACATCCCATC ACTCGTATAC CAACCACGTT TATTTTCCTT 5280 

GTTCATGTGT TCGTAATTCA AGGTACGACT GGAAAAGAGT GACAAGCCAA ATCCAAACCC 5340 

TTTCTCTGCA TTGTACATGG CTGTTTTATC CATCTTGTTA AAGGCAGATA GGTAACTTGG 5400 

TCTTGGAACA CTTGCGACTC CTGCATCACT TAACAAGGAT TGCATCAAAC TGATATCCTT 5460 

ATAAGTCTTC AAATTCTTAA AGACATCATA ATAACTATCC GATTGAACAA TGGTCTTCAC 5520 

AAGACTCTGC AAACATTGTT TGGTTTCTCC TTCAGACATA TCCGCTATTC GGTGAATCCC 5580 

TCTTAGTACT TCTACTGCGG CCACGTGCCC CTCGCTATTT GCACGACTGA TCGAGCGTCC 5640 

ACGACTCATA TCCATCAACT CTCCATTCAC CAGCAAAGGA GCAAACGATT TATCAATCCA 5700 

GTGGTACATG GTTTGCATTT TATCTTTATC GATTGGATTC TTGGTCTTTT GAATGACTGG 5760 

CAACAGTTGA GACAGGCCAT CAATCAAAAC ATTCCCATAA GCACCCGTAT AGGCAACATT 5820 

GGTGTGGTCG ATATAGGATC CATCTTGATA AAAACCTTCA CCTTGGTCTA CCAACTTGAA 5880 

CACTTGCTCA ATCGAGCGAA TGGTAGAAGA AATTTCTTGA TCATCCTTAC GCAGTAAACC 5940 

AGCTATTACT TTTACCCTTC CCATATCAAC TAAGTTTCCA CCTAGAGCCT TGAATGGGTT 6000

ATCAGTCGTC TTTCGGAAAT GTTCGGGATC TGGTACAAAT TTTTCAATCA CATCTGTATA 6060 

TTTTTTAATT TCCTCATCAG AGAAGTATTC TTTCATCAGA GACAAGGTAT TGTTGATGGC 6120 

ACGAGGTGTA CCGATTTCAT AATCCCACCA GTTCCCAACA ATGCTCTTTT CACTATTGTA 6180 

GACATGTTTA TGCATCCATT CCATGGAATC CCTGACTGTT CGAACGACAG TTTCATCTTG 6240

ATAATAACGA GAAGAAGGAT TGGTCACTTG CTTGGCCATC TCCTCCAATT TCCGATAAGT 6300 

GGCAGTCAGA TTTGCAGACG TTTTATAATT TGAAAATTTT TCCCACAAAT AGGTGCGGTC 6360 

CGCCTGACTT GAAATACTGG ATAGGCTATC AGCTACCTTT CCTTCCAATT CCTGGTTTAA 6420 
TTTGGCCATC TGTTCATTTT TAGAATCATA GTATTGATTC CCAGCGATGA TGCCATTCCA 6480

GTCATCCAAA CGGTCTGTGT ATGCATCCTT AACAGAGGCC AGAATCTTCA AAGGAATCTT 6540 

TTTCACTTCC TTGCCATCTT TACTGACAAT GACATTGGTT GTCCCTTCCT TAAGAGGTTC 6600 

TAAAATTCCA TTTTTGACTG AAGCAACGTC AGGATTTTCT ACCTTATAAG TATAGTCCGC 6660 

AAGAGAAAAA ACATGTTTTT TTCCAATTGG TAAATCAATC TTTTCCTCAA GCTGTTTATC 6720 

TGTTTGAGAA TCCTCAGAAA GCTGGTCTGC TACCTCTACC AGCTCAATAT CCTTAAAGGA 6780 

AACAGTCCCA GTTCCTGTTT CATAGAATAA CTCCAGCTTG ATTTTATCAA CATCTAAAGT 6840 

CGGGCTATAG TCTGCTTCAA TGGTCTGCCA GTCCTTTGTT CCTGACGTCG TTGCAGAATT 6900 

CCACAATCGC TTGTCCTTAC CACTTTCCTC AATGATACGA ACTTTGGCAA TCCCGATTTT 6960 

ATTATCTGTT TTAATCTTGA AACGCAGTTT ATACTTTTTC TTAGCTTCAA TAGGAACCAT 7020 

ACGGTGAAGC GCTGCCCTTA ATTTCTCATG GCTTGAGATA GTGATAGCCC CATCCTTAGC 7080 

CTCAATGACT CGAGTTGAGG CATCTGCACT ATTCTTCTGG TCTACCCAAG CTGACCACCC 7140 

CCTGAGCTTT GCTTCCTGTC CGG 7163
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9244 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

CGTTATAACA TACATGTAAG CGGTACCCAA AATGGTGCCA AGTCAAAATT TTTAAGGAGG 60 

AAAATACATG TCTTCACATC CAATTCAGGT CTTCTCAGAA ATTGGGAAAC TGAAAAAAGT 120 

TATGTTGCAC CGTCCAGGCA AGGAGTTAGA AAACTTGTTG CCGGACTATC TTGAAAGGCT 180 

TCTTTTTGAT GATATTCCTT TCTTGGAAGA TGCTCAAAAA GAACATGATG CATTTGCCCA 240 

AGCTCTTCGC GATGAAGGAA TTGAGGTTCT CTACCTAGAA CAACTCGCTG CTGAATCATT 300 

GACCTCTCCA GAAATCCGCG ATCAATTTAT CGAGGAATAC TTAGACGAAG CCAACATCCG 360 

TGATCGTCAA ACCAAGGTTG CTATTCGTGA ATTGCTTCAC GGCATCAAGG ACAACCAAGA 420 

ATTGGTTGAA AAAACAATGG CTGGGATTCA AAAAGTTGAA TTGCCAGAAA TTCCTGACGA 480 

AGCTAAAGAT CTAACTGACT TAGTTGAATC AGAGTATCCA TTTGCAATTG ACCCGATGCC 540 

AAACCTCTAT TTCACTCGCG ACCCATTTGC AACAATTGGA AACGCCGTAT CGCTTAACCA 600 

CATGTTTGCA GACACTCGTA ACCGTGAAAC ACTCTACGGT AAGTATATCT TCAAATACCA 660 
CCCAATCTAT GGCGGAAAAG TGGATTTGGT CTACAACCGT GAAGAAGATA CGCGTATCGA 720

AGGTGGAGAC GAGTTAGTTC TTTCTAAAGA CGTCCTTGCA GTAGGTATCT CTCAACGTAC 780 

AGACGCAGCT TCTATCGAAA AACTTTTGGT CAACATCTTC AAGAAAAATG TTGGCTTCAA 840 

GAAAGTTTTG GCCTTTGAAT TTGCTAACAA CCGTAAATTC ATGCACTTGG ATACTGTCTT 900 

CACTATGGTA GACTATGACA AGTTCACTAT TCACCCAGAA ATCGAAGGCG ACCTTCACGT 960 

TTACTCAGTT ACTTACGAAA ACGAAAAACT TAAAATCGTT GAAGAGAAAG GTGACTTAGC 1020 

TGAACTTCTT GCTCAAAACC TTGGTGTAGA AAAAGTTCAT TTGATTCGTT GCGGTGGTGG 1080 

CAATATCGTA GCAGCTGCGC GTGAACAATG GAACGACGGT TCTAACACTT TGACCATCGC 1140 

ACCTGGTGTG GTAGTTGTTT ATGACCGCAA TACCGTGACC AATAAGATTT TGGAAGAATA 1200 

CGGGCTTCGC TTGATTAAGA TTCGCGGAAG TGAATTGGTT CGGGGCCGTG GTGGACCTCG 1260 

TTGTATGTCT ATGCCATTTG AACGTGAAGA AGTGTAATCG CTGTTCGATA TTCGTCAATA 1320 

GAAAATGTAA AAAATAGAAA GAGGAAATAA TAAAATGACA AATTCAGTAT TCCAAGGACG 1380 

CAGCTTCTTA GCAGAAAAAG ACTTTACCCG TGCAGAGTTA GAATACCTTA TTGGTCTTTC 1440 

AGCTCACTTG AAAGATTTGA AAAAACGCAA TATTCAACAC CACTACCTTG CTGGCAAGAA 1500 

TATCGCTCTC CTATTTGAAA AAACATCTAC TCGTACTCGT GCAGCCTTTA CAACTGCGGC 1560 

TATCGACCTT GGTGCTCACC CAGAATACCT CGGAGCAAAT GATATTCAGT TGGGTAAAAA 1620 

AGAATCTACT GAAGATACTG CTAAAGTATT GGGACGTATG TTTGACGGGA TTGAATTCCG 1680 

CGGATTCAGC CAACGTATGG TTGAAGAATT GGCAGAATTC TCAGGCGTTC CAGTATGGAA 1740 

CGGTCTAACT GACGAATGGC ACCCAACTCA AATGCTCGCT GACTACTTGA CTGTTCAAGA 1800 

AAACTTCGGT CGCTTGGAAG GCTTGACATT GGTATACTGT GGTGATGGAC GTAACAACGT 1860 

TGCCAACAGC TTGCTCGTAA CAGGTGCTAT CCTTGGTGTC AATGTTCACA TCTTCTCACC 1920 

AAAAGAACTC TTCCCAGAAA AAGAAATCGT TGAATTGGCA GAAGGATTTG CTAAAGAAAG 1980 

TGGCGCACAT GTTCTCATCA CTGAAGATGC TGATGAAGCA GTTAAAGATG CAGACGTTCT 2040 

TTACACAGAC GTTTGGGTAT CAATGGGTGA AGAAGACAAA TTCGCAGAAC GTGTAGCTCT 2100 

TCTTAAACCT TACCAAGTCA ATATGGACTT AGTTAAAAAA GCAGGCAATG AAAACTTGAT 2160 

CTTCCTACAC TGCTTGCCAG CATTCCACGA TACTCACACT GTTTATGGTA AAGACGTTGC 2220

TGAAAAATTT GGTGTAGAAG AAATGGAAGT AACAGACGAA GTCTTCCGCA GCAAGTACGC 2280 

TCGCCACTTC GATCAAGCAG AAAACCGTAT GCACACTATC AAAGCTGTTA TGGCTGCTAC 2340 

ACTTGGTAAC CTTTATATTC CTAAAGTATA ATTTTAGATA ATAAACCGTC TACCAACAGC 2400 
TATGAGGGCT GCGACTAATA GCTTTAGTCC GGTCCTCTTT TATGTAATGG TAATCTATTA 2460

TTTCTTATAA AATATGTGAA AAATCATTAA ATTGAAATCT AAACGCATTC TATTGAGTGT 2520 

GATAAAGGAG AATTTATGGC AAATCGTAAA ATTGTAGTAG CTTTGGGAGG AAATGCGATT 2580 

CTTTCTTCTG ACCCATCAGC AAAGGCTCAA CAAGAAGCTT TAGTTGAAAC AGCTAAGCAT 2640 

CTTGTAAAAT TGATTAAAAA TGGAGATGAT CTGATTATCA CTCACGGTAA TGGACCTCAA 2700 

GTTGGGAATC TCTTGCTCCA ACATTTGGCA TCAGACTCTG AAAAGAACCC TGCCTTCCCA 2760 

CTCGACTCAC TTGTCGCTAT GACAGAAGGT AGCATCGGTT TCTGGTTGAA AAATGCTTTG 2820 

CAAAATGCTC TCTTGGATGA AGGCATCGAA AAAAATGTTG CCTCTGTTGT AACGCAAGTT 2880 

GTCGTAGATA AAAATGATCC AGCTTTTGTT AACTTGAGTA AACCAATCGG TCCTTTCTAT 2940 

TCAGAAGAAG AAGCAAAAGC AGAAGCCGAA AAAAGCGGAG CGACTTTCAA GGAAGATGCT 3000 

GGCCGTGGCT GGCGTAAGGT CGTTGCCTCA CCAAAACCTG TTGACATCAA AGAAATTGAA 3060 

ACCATCCGTA CTCTTTTAAA TAATGGTCAA GTCGTCGTAG CTGCAGGTGG TGGCGGTATT 3120 

CCCGTCGTCA AAGAAAACAA TGGACATTTG ACTGGTGTCG AAGCGGTTAT TGATAAAGAC 3180 

TTCGCTTCCC AACGTTTGGC AGAATTGGTT GATGCAGACC TCTTCATCGT TTTGACAGGT 3240 

GTAGATTATG TATTTGTTAA CTACAACAAG CCAAACCAGG AAAAATTGGA ACATGTGAAT 3300 

GTTGCCCAGC TGGAAGAATA TATCAAACAA GATCAGTTTG CACCAGGTAG CATGCTTCCA 3360 

AAAGTAGAAG CAGCTATCGC TTTTGTCAAT GGTCGTCCAG AAGGAAAAGC AGTTATTACT 3420 

TCCCTTGAAA ATCTAGGCGC CTTGATTGAA TCTGAAAGCG GAACAATTAT TGAAAAAGGA 3480 

TAAGTTGTTT TACTAATAAG ATGTATTCTA TTTCTAGTAT CTTTATATCA AATTAGAAAT 3540 

TATTCTTGAA AACATGTACA ATATTTCAAA AGATACTAGT TTTAGACTTT AATATGGTAA 3600 

AACAAATATA AATAGAAAGC GTTTTCTTGA ATGTTTATTT AAGAAAGTAG TTGGTTTTTT 3660 

ACACTTTGTT AGACATCAGG AGGAAAAACA AATGAGTGAA AAAGCTAAAA AAGGGTTTAA 3720 

GATGCCTTCA TCTTACACCG TATTATTGAT AATCATTGCT ATTATGGCAG TGCTAACTTG 3780 

GTTTATCCCT GCGGGGGCCT TTATAGAAGG TATTTACGAG ACTCAGCCTC AAAATCCACA 3840 

AGGGATTTGG GATGTCCTCA TGGCACCGAT TCGGGCTATG CTAGGTACTC ATCCAGAGGA 3900

AGGTTCGCTC ATTAAAGAAA CGAGCGCAGC GATTGATGTA GCCTTCTTCA TCCTTATGGT 3960 

TGGTGGTTTC CTTGGCATTG TCAACAAAAC TGGTGCTCTT GACGTAGGGA TTGCCTCTAT 4020 

CGTGAAGAAG TATAAGGGCC GCGAAAAAAT GTTAATTTTG GTACTGATGC CTTTGTTTGC 4080 

CCTCGGTGGT ACAACTTATG GTATGGGTGA AGAAACAATG GCCTTCTATC CACTCCTTGT 4140 

GCCAGTTATG ATGGCCGTTG GTTTTGATAG CCTGACTGGT GTTGCAATTA TTTTGCTCGG 4200 
TTCTCAAATC GGCTGTTTGG CATCTACTCT GAATCCATTT GCGACAGGTA TTGCTTCAGC 4260

GACTGCGGGA GTTGGTACAG GGGACGGTAT CGTACTTCGT CTGATCTTCT GGGTTACCTT 4320 

GACTGCTCTT AGTACTTGGT TTGTTTACCG TTATGCGGAT AAGATTCAAA AAGATCCGAC 4380 

TAAGTCACTG GTTTATAGTA CTCGCAAAGA AGATTTGAAA CACTTTAACG TAGAAGAATC 4440 

TTCATCTGTA GAATCTACAC TTAGCAGCAA ACAAAAATCA GTTCTCTTCT TATTTGTGTT 4500 

GACATTCATC TTGATGGTAT TGAGCTTCAT TCCATGGACA GACCTTGGCG TTACCATTTT 4560 

TGATGACTTT AATACTTGGT TGACTGGTCT TCCAGTTATT GGTAATATTG TCGGTTCATC 4620 

TACTTCTGCA CTAGGTACTT GGTACTTCCC AGAAGGCGCA ATGCTCTTTG CCTTTATGGG 4680 

TATCCTGATT GGTGTTATTT ATGGTCTTAA AGAAGATAAG ATTATCTCTT CCTTCATGAA 4740 

TGGTGCTGCT GACTTGCTCA GTGTTGCCTT GATCGTAGCG ATTGCTCGTG GTATTCAAGT 4800 

TATCATGAAC GACGGTATGA TTACCGATAC AATCCTCAAC TGGGGTAAAG AAGGCTTGAG 4860 

CGGTCTATCT TCACAAGTCT TTATCGTTGT AACTTATATC TTCTATCTAC CTATGTCATT 4920 

CTTGATCCCA TCTTCATCTG GTCTTGCCAG CGCAACTATG GGTATCATGG CTCCACTTGG 4980 

AGAATTTGTA AATGTCCGTC CTAGCTTGAT TATCACTGCT TACCAATCTG CTTCAGGTGT 5040 

CTTGAACTTG ATTGCACCAA CATCTGGTAT TGTGATGGGA GCTCTTGCAC TTGGACGTAT 5100 

CAACATTGGT ACTTGGTGGA AATTCATGGG CAAACTCGTA GTCGCTATTA TTGTAGTGAC 5160 

CATCGCCCTT CTTCTCCTTG GAACCTTCCT TCCATTCCTA TAAAATAGTG AGTGAGGTGA 5220 

TTCCATGAAA ATAGATATAA CAAATCAAGT TAAAGATGAA TTTCTTATAT CATTAAAAAC 5280 

CTTGATTTCC TATCCTTCAG TACTCAATGA AGGAGAAAAT GGAACACCTT TTGGACAAGC 5340 

AATCCAAGAT GTCCTAGAAA AAACTTTAGA GATTTGTCGA GACATAGGTT TCACTACCTA 5400 

TCTTGACCCT AAAGGTTATT ACGGATATGC AGAAATCGGT CAGGGAGCAG AGCTTCTGGC 5460 

CATTCTCTGT CATTTGGATG TTGTTCCATC AGGTGATGAA GCAGATTGGC AGACACCGCC 5520 

ATTTGAAGCA ACTATCAAAG ACGGCTGGGT ATTCGGACGT GGTGTCCAAG ATGATAAAGG 5580 

CCCTTCGCTC GCAGCTCTCT ATGCAGTAAA AAGCTTGCTG GACCAAGGTA TTCAGTTCAA 5640 

AAAGCGCGTA CGCTTTATCT TTGGTACCGA TGAGGAAACC CTCTGGCGCT GCATGGCACG 5700 
CTACAATACC ATCGAAGAAC AGGCCAGTAT GGGCTTTGCA CCTGACTCAT CTTTTCCTCT 5760

GACCTATGCT GAAAAAGGGC TTCTACAGGT CAAACTTCAT GGCCCTGGAT CGGATCAACT 5820 

AGAGCTTGAA GTAGGAGGCG CCTTTAACGT TGTACCAGAC AAGGCCAACT ACCAAGGTCT 5880 

CCTCTATGAA CAGGTTTGTA ACGGTCTCAA AGAAGCTGGT TATGATTACC AAACCACTGA 5940 
AACACCGTTT TGAGGTTGCA GATAGAACTG ACGAAgTCAG CTCAAAACAC TGTTTTGAGG 7800

TTGCAGATAG AACTGACGAA GTCAGTAACA TCTATACGGC AAGGCGACGC TGACGTGGTT 7860 

TGAAGAGATT TTCGAAGAGT ATTAGTCTAT TATTTCTTCT CAGCGCGAAG GGCTGACAAG 7920 

ATTTGTGTTC GGATATCATC CACACCATTT GGAGTATTTG GTAAAAAGAT AGTTTGATTT 7980 

CCTTTAGAGG CAAAGGTATT CAAGGTATCC AAATACTGGT TGGTCAAGAG GATAGACATG 8040 

ATTTGTTCTT CTGTCATGCC AACATTGGCT TCCTTGAGTT CGGTGATAGA CTCTGCCAAT 8100 

CCATCCACAA TCGCCTTACG TTGTTGGGCA ATCCCCACAC CATGAAGGCG GTCTTTTTCT 8160 

GCTTCTGCTT CAGCTGCAGT GACAATTTTA ATCTTGTCAG CTTCCGCCAA TTCTTGTGCT 8220 

GCGACCCGCT TACGTTGCGC CGCATTGATT TCATTCATGG ATTGCTTAAC TTCTGCATCT 8280 

GGTTCGACCT TGGTAATCAA GGTTTTCACG ATAATGTAGC CGTAAGTGGT CATTTCTTCT 8340 

GCTACTTGGT GTTGAACTTC AAGGGCAATC TCATCTTTTT TCTCAAACAA TTCATCCAAG 8400 

GTTAATTTTG GAACAGAAGA GCGAAGAGCA TCTTCGATAT AAGATTTAAT CTGAGATTCT 8460 

GGACGTATGA GTTTATAGTA AGCATCTGTC ACGCTCTGCT CGTTGACACG GTACTGAGTC 8520 

GCTACATTCA TCATAACGAA CACATTGTCC TTGGTCTTAG TCTCAACCAC AATATCACTT 8580 

TGCAACAAGC GCAACTGAAT CCGTGCTGCA ATCGAGTCAA TCCCAAAAGG CAAGCGAATA 8640 

TGAATACCGC TATTAGCAAC CTTTTGGTAT TTCCCAAAGC GTTCAATAAT CGCCACCGAC 8700 

TGCTGACGAA CCACATAAAC TGTACTCAGT GTGACTATCA CCAATAGGAG CACACAAACA 8760 

ATCAGAAAAA TCATGAAAAA TATTGCCATA ATGGAACCTC CACAAGTATT TTTCTAGTAT 8820 

TATAGCACAT TTAAAGAAGG CTGTGCCGTT TTTACTGCGA TTTTTCCTGA AATGTCAATA 8880 

ATTAGAGGTG AATTGTCCTA TTGTCGTCCA ATCTCTTGCT AAAATAACTC TTTATAAAAG 8940 

GCAATCGTTT CTTCTAAGGT TGGCATAAAT GGATTTCCTG GTGCGCAGGC ATCAATCAAG 9000 

GCATTCTTAG AAAGGTATTC AAAGTCGAAA TCTTTTTCTT CAATACCAAG TTCAGTCAGT 9060 

TTCTTAGGAA TACCTACTGT CTCAGAAAGC TTCTCAATCT CAGCAATCGC ATAATCGGCA 9120 

CATTCTTGAT CTGATTTACC TTCTACATGA AGTCCCAAGG CTTTGGCAAC ATTGCGGAAA 9180 

GCTTCTGGTA CACGTTTAGC ATTTTCACGT TCTATAACTG GTAGCAACAT GGCACAGCAC 9240 

ACGG 9244 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8898 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 



GATCTGAACT TTATCATCAT AACTTAATTT CATAATAAAA ACACCCCAAA AGTTAGATTT 60

TTTCTGTCTA ACTTTTGGGG TGTAGTTCAG TCATTGGACT GACGTTTTTT TGTATGCTTA 120

TTTTGATTTG ATGTAGTTGA TACCATCTGC TTTTGGTGCG ACTGCTTTTC CAAAGAAGGC 180

TGCTAAGACA AGAATTGTCA AAACATAAGG TGCAATTTGA AGATAAACCG CTGGCACTCC 240

TTGTAGGAAC GGCAATTGAG AACCGATAAC AGCCAAACTT TGTGAAAGTC CAAAGAAGAG 300

ACTAGAAAGC ATAGCACCGA TTGGATTCCA TTTCCCAAAG ATCATCGCAG CAAGGGCGAT 360

AAATCCAGGT CCAACAATAG TTGTCACTGA GAAGTTAACT GAGATTGATT GCGCATAAAT 420

CGCTCCGCCA ATTCCACCTA GAAAACCTGA AATAATAACC CCTAAATATC TCATCTTGTA 480

GACGTTGATT CCCAAGGTAT CCGCTGCTTG AGGATGTTCA CCGACAGAGC GGAGACGAAG 540

ACCAAATTGA GTCTTAAAGA GAATAAACCA AGCAAGGAAT GAGAAGGCAA TCGCCAGATA 600

ACCAAGTAGA CTAGTTGACT TGAAGAAGAT ATCACCAATC ACTGGGATAT TTGCCAAGAC 660

TGGGAAATCA AAGCGTCCAA AAGTTTGACT TAGGTTGTCG GTTTGTCCTT TGTTATAAAG 720

AACTTTAACT AAGAAAACAG CCAAGGCAGG CGCCATCAAG TTCAATACCG TACCGCTGAC 780

AACATGGTCT GCACGGAAAT GAACCGTCGC TGCTGCGTGG ATGATAGAGA AAACACTACC 840

AACCAATCCT GCTACAAGCA AGGATAGCCA TGGAGTTGCT GCTCCAAATT GTTCTGCAAA 900

TTCAAGGTTA AAGACAACTC CAGAAAAGGC ACCCATAACC ATAATTCCTT CAAGGCCAAC 960

GTTTACCACA CCACCACGTT CAGAGAAAAC ACCACCGATA CTTGTAAAGA TGAGAGGTGC 1020

TGAGTAAATC AGCATAGAAG ACACCAAGAG GGGGAGCAAG GTTATAATAG ACATCTTTAC 1080

TTACCTCCTT TAACTTGTTT TTTCGGTTTG ACAAAGCGTT CGATAAGGTA ATGAACACTG 1140

ACAAAGAAGA TAATAGACGC TGTTACAATG CTGACAAGCT CAGATGGTAC CTGCGCCGCA 1200

TTCATACCAG GAGCCCCAAC TTGGAGAACG CCAAATAGGA AGGCTGCAAA GAGTATACCA 1260

ATTGGTGAGT TGGCCGCAAG CAAACTAACC GCCATTCCGT TAAATCCGAT AGCTAATGAC 1320

GAACCTTGAA CATAGACGTT CTGGAAGGTT CCCAAACCTT CAACAGCTCC ACCAAGACCT 1380

GCCAAGGCAC CTGAAATAAT CATAGATAGG ATAATAGTCC GCTTGGCAGA AATACCAGCA 1440

TATTCTGAAG CATGTGGATT AAGACCAACT GCACGGATTT CAAAACCAAG AGTTGTTTTC 1500

TTGAGCATGA ACCAAATAAC TGCAACGGCA ATGATGGCAA AGAAAATACC AATATTCATC 1560

CGTGAGTTAC CAGTCAACTC AGCCAACCAA GGTGTCTGAT AGGTTGCATT AGCCCCAACA 1620




CGAATGGTCG AATCTGTACT TTGCATGAAG TCTTTAGGGA AAGCATGGAT AAAGGCATTC 1680 

CCTACATACA AGACAATGTA GTTCATCATG ATGGTTACAA TAACCTCTGA CGTCCCTAGA 1740 

TAGGCCCTAA GAATACCTGG AATCGCTCCG ACAATCCCAC CAGCAATCAA GGCAATCACG 1800 

ATGGTTGCTA GAATCATCAA GGGACGGGGC ATATCTGGAT GCGACAGGGC AAACCAACCA 1860 

CTGAGAATCC AACCTGCCAA AGCCTGACCA GGAAGTCCGA CGTTAAAGAA ACCAGCTCGA 1920 

CTGGCAACGG CAAAACCAAG ACCAATCAAG ACCAGAGGAC CCATAGCACG GAAGATTTCT 1980 

CCAATCCCAC GCAGACTGCC AAAGGCTGTA TAGAACAATT CTTCGTAGCC CCAAATAGCA 2040 

TCATAACCGA AGATCCACAT GACAATGGCT CCGAGTAAAA TTCCTAGGAA TACAGAAATC 2100 

AAGGGAACCG AAATTTGTTG TAATTTTTTA GACATCACTC TTCTCCTTTC CCAAGTTTCC 2160 

ACCAGCCATC AAGACACCAA GTTCTTGTTT ATTGGTTGTT TCTGGTGATA CAATACCTTG 2220 

AATCTTACCA TCGTGGATAA CGGCAATACG GTCTGAGACG TTTAAAATCT CATCCAATTC 2280 

AAAGCTGACA ACAAGGACAG CCTTGCCATT ATCACGCTCT TCAATCAAGC GTTTGTGGAT 2340 

ATACTCAATG GCACCGACAT CCAACCCACG AGTTGGCTGG CTAACGATAA GGAGATCAGG 2400 

ATCTCGATCA ATTTCACGAG CAATAATTGC TTTTTGTTGA TTTCCTCCTG AGAGTGCAGC 2460 

TGCAGGAACT AATTCACTGG CAGCGCGAAC ATCAAACTCT TCCATCAGCT TTTTAGCATA 2520 

AGAAGTAATA TTTGAATAAT TCAAAATTCC ATTTTTACTA TGTGGTTCTT TATAGTAGGT 2580 

TTGAAGGGCA ATATTTTCAG ATATCATCAT TTCCAAAATC AAGCCATCAC GGTGACGGTC 2640 

TTCTGGAACG TGCCCAACAC TTAGTTCTGT AATCTGACGT GGGTGCAAGC CTACAATTGA 2700 

ATCTCCTTTT AGCTCAATGC TACCAGATTC AACCTTACGA AGACCTGTAA TGGCTTGAAT 2760 

CAGTTCAGAC TGACCATTTC CATCAATCCC CGCAATACCA ACAATCTCTC CAGCACGAAC 2820 

ATCCAAGGAC AGATTTTTAA CAGCTGGAAC ACCACGGTTT TCATTGACCA CCAAATCTTT 2880 

GATAGACAAA ACCACTTCTT TTGGTTTAGA GGCTTGCTTC TCTGTTTTAA AGGAAACAGA 2940 

ACGTCCTACC ATCATTTCCG CCAAATCAGC ATTGGTAGCC CCTGCAATTT CAACGGTTTC 3000 

AATTGATTTC CCACGACGGA TAACTGTAAC ACGGTCAGAA ACTGCTCGAA TTTCATCCAA 3060 

TTTGTGGGTA ATCAAGATAA TTGATTTTCC TTCTTTGACA AGATTTTTCA TAATAGCCAT 3120 

CAACTCATCA ATTTCTGATG GAGTCAAAAC AGCCGTTGGT TCGTCAAAGA TAAGGATATC 3180

AGCCCCCCGA TAAAGTGTTT TTAAAATTTC TACACGTTGT TGGGCTCCAA CTGAGATATC 3240 

TGCTACCTTG GCAGAAGGGT CAACAGCTAA GCCATAACGT TCAGAAAGAG CCTTGATTTC 3300 

TTTGCTAGCT CCAGCGATAT CTAGCACACC ATTTTTAGTC AATTCACTAC CTAAAATGAT 3360 
GTTTTCAGCC ACTGTGAAGG CTTCAACCAA CATAAAGTGC TGGTGAACCA TCCCGATTCC 3420

CAAGCTAGCT GCTTTAGATG GGGAGTCGAG ATTGACAACT TGACCGTTGA CCGCGATTTC 3480

ACCACTAGTT GGTTCAAGAA GGCCTGCTAA CATGTTCATT AGCGTGGACT TACCAGCCCC 3540

ATTTTCTCCT AAAAGTGCAT GAATTTCACC TTTTCGTAGG TGCAAGTTGA TTTTGTCGTT 3600

GGCAACAAAT CCACCAAACA CCTTGGTAAT ATCACGCATC TCAATGACAT TTTCGTGTGC 3660

CATGTGCTCT TCCTTTCAGA GTCTTATTTT ATTTCAATAA AACTTGCTAG TTTGTCTAGT 3720

AGCAAGCTTT ACTTAGACAA AATGACTTTG TCTCAACTCT TAAAAAAGCG GCCCTTGGCC 3780

GCTTCCTAAG AAATGACTTC CATCCATTAT TTTTCAGGAA CTTTTACGCT TCCATCAAGG 3840

ATTTTAGCTT TTGCATCTTC GACAGCTTTT TTACCTTCTT CTGAAAGGTT TGTTACTGCC 3900

AAGTCAACCC CTTTATCCTT CAATGAGTAA ACGATCACTT GACCGCCAGG GAATTCTCCT 3960

CTTTCTGCCT TGTTAGAAAT ATCTTTTACA GTTGTACCAA CTTGTTTCAA AGTAGATACA 4020

AGAACAAAGT TTGATTCTTT GCCATCTTTA GAAGTGTATT TACCTTCTGC TTCTTGGTCA 4080

CGATCAACAC CGATAACCCA AACTTTTTCA TTTTCAGGAC GGCTTTCGTT GAGAGATTTT 4140

GCCTCTGCAA AGACACCTGC ACCTGTACCA CCAGCTACTT GGTAAACAAT ATCTGCACCG 4200

GCTGCGTATT GTGCGGCTGC AATTGTTTTA CCTTTAGCCG CATCACCAAA TGAACCAGCG 4260

TAGTCAACTT GGACTTTGAT AGATGGGTCT ACTGACGCAA CACCAGCCTT GAATCCTGCT 4320

TCAAAACGAG AGATAACTTC AGATTCGATA CCACCTACAA AACCAACTTG TTTTGTCTTA 4380

GTTGTTTTTG CTGCAGCCAC ACCTGCAAGg TAACCTGACT CATTATCAGC GAAAGTTACG 4440

CTCGCAACAT TCTTTTGGTC TTTAATCACA TCATCAATCA AGACATAGTT CAAGTCAGTG 4500

TGTTCTTTTG CTGCATCTTT AACTGCATTA TTAAGGGCAA AACCAACACC GAAGATTAGG 4560

TTGTAACTTC CAGCCGCTTG TTGCAAGTTG TTAGCGTAGT CAGCTTCACT TGTTGATTGG 4620

AAGTAAGTGA AACCGTTATC TTTTGAAAGA TTGTGTTCTT TACCCCAAGC CTGCAAACCT 4680

TCCCAAGCTG ATTGGTTGAA TGATTTGTCA TCAACACCAC CAGTATCAGT GACGATTGCT 4740

GCTTTTGTCT TCACATCAGA AGATGAAGCT GCGTTACGAG AAGAGCGGTT ACCACATGCA 4800

GCAAGTCCAA CTGCTGCCAC TGCAACTAGG CCAAGACCTA GCCATTGTTT CTTGTTCATT 4860

ACTGAACCTC CTAAATAAGA TGTGCAACGA TGTTGCAAGT ATGGATTGGT TGGCCACAAG 4920

GACCGTGCCA CTCAGAGAGC GACTCAGACT AGTTTAAGTC TGTAAAAGAG TATGGAAGTA 4980

ATTCCCCGAC CGTCATCTCG ACCGTCGATT TATCTTTTGC GACTAAGGTC ACTTTTAGAT 5040

CTTGTTCAAA AAATTCAGCC ATCACTTGGC GACAAGCACC ACATGGCGAG ATCGGTTTTT 5100

CAGTTTGACC ATAGACAATC AATTCTGAAA ATTCTCTTTG GCCTTCAGAT ATAGCCTTAA 5160

AAATAGCTGT TCTCTCACCG CAATTGGTCA AAGGATAGCT AGCATTTTCA ATATTCACTC 5220

CCGTGTAAAC ACTTCCGTCT TTAGCTACTA AAACTGCTCC GATAGGAAAG TGAGAATAGG 5280

GGACATAGGC ATGTTTGCTG GTTTCAATTG CCAGTTCAAT CAACTCAGTA GTCGCCATCT 5340

GCCAATTCTC CTTTTAAAAT AGCTACCCCA GCTGACGTTC CGATACGGGT CGCACCTGCT 5400

TCGACAAAGG CAAGAGCATC TGCATAAGAA CGAGCTCCAC CGGCGGCCTT GACACCCATA 5460

TCAGATCCAA CTGTTTCACG CATTAATGTA ACATCTGCTA TCGTAGCACC ACCAGTTGAA 5520

AAGCCAGTAG ATGTTTTGAC AAAGTCAGCC CCAGCTTTTT GGGCCAATTG GCAAACAACA 5580

ACTTTTTCTT GGTCTGTCAG AAGGCAAGCT TCAATAATGA CTTTCACTAA CTTATCACCA 5640

CTTGCTTCCA CTACTGCGCG AATATCTGAC TCAACCAAGG CTAAATTACC TGATTTGAGA 5700

GCTCCAACAT TGATCACCAT ATCAATCTCA TCTGCACCAT TTTGGATAGC TTCTTTTGTC 5760

TCAAATGCTT TCACGGCTGA AGTTGTTGCT CCCAAAGGGA AACCTACTAC TGTGCAAACC 5820

TTAACATCTG TGCCTTCAAG TCCTTTTTTA GCATGTTCAA CCCAGGTCGG ATTAACGCAA 5880

ACACTGGCAA AGTCATACTC TCTAGCCTCA GACAACAAAC TATCAATTTG TTTTTTCTTT 5940

GCATCTTGTT TTAAAAGCGT ATGATCTATA TATTTATTTA ATTTCATTTC GGTTTTCCCT 6000

CCATTTAGGA GATGATTTCT ACAATTTCAC GGATTTTTTT CACTTCATCA CTTATTTTAA 6060

CACATTTTTG GAAATCTGTA ACTAGTTGAG GTGGAATTTT TTCATTTGTG TATACTTTTG 6120

CAACAATTTC ACCCTTTTGA ACGGAGTCTC CAATCTTCTT TTCAAAAACA ATTCCTGTTT 6180

CATAGTCCAA GGCATCAGAC TTAACTGCAC GACCAGCACC CAGCCTCATG GCATAAAGAC 6240

CAAAGTCCAT AGCTGGAAGA GCTGAAATGA CACCCGTTTC CTGAGCAGGG ATTTCCACCA 6300

CATGAGCTAC ATTTACAGGA CGATAGAGGT CTTCCAAGTC TCCACCTTGG GCTTGCACCA 6360

TTTCCTCAAA CTTAGCCAGT GCTTGACCAT TCTCAAGATG TTGGTGAACT TCTTCAACAG 6420

TTTTGTTAAC ATTTGCCAAA CCAAGCATAA TTTGAGCCAA TTCACAAATA AAGTGGGTAA 6480

TATCCTGACG TCCTTGACCT TGCAAAATCT CCAATGCTTC AAGGATTTCC AGACGATTTC 6540

CAATCGCTCG TCCCAAAGGC TGGCTCATAT CCGTAATCAC TGCTACTGTC TTCCGTCCAA 6600

CAACCTTACC AAGATCTACC ATAGTTTGAG CCAACTCACG CGCCTCATCA ACCGTCTTCA 6660

TGAAGGCACC CTCACCGACA GTCACGTCTA GCAAAATAGC ATCCGCCCCT GCCGCAATTT 6720

TCTTGCTCAT CACCGAACTC GCAATCAAAG GAATCGTGTC GACAGTTGCG GTCACATCAC 6780

GAAGGGCATA GAGAAGCTTA TCTGCTTTGA CCAGCTGGTC TGATTGCCCA ATGACAGATA 6840

CTCCAATATC CTGAACCTGA CGAATAAAAT CCTCTTGACT ACGTTCTACT TGATAGCCCT 6900

TAATGGACTC


CAATTTATCA 


ATTGTTCCGC 


CTGTATGGCC 


AAGACCACGA 


CCACTCATTT 


6960 


TTGCTACAGG 


CACACCGAAG 


CTAGCAACAA 


GAGGAGCTAA 


AATCAAGGTT 


ACCTTATCGC 


7020 


CGACACCACC 


AGTAGAATGC 


TTGTCAACTT 


TCACACCATC 


AATGGCTGAC 


AGGTCAAACT 


7080 


CTTGCCCAGT 


CTTAACCATA 


TTCATCGTTA 


AATCAGAGAT 


TTCTCGAGTC 


GTCATTCCTT 


7140 


TAAAATAAAC 


AGCCATAGCA 


AAGGCAGACA 


TCTGATAATC 


AGGAACAGTT 


CCTGATACAT 


7200 


AGCCTTCTAT 


CAGCCATTCA 


ATTTCACTTG 


AAGTCAGTTC 


TTGACCGTCT 


CGTTTTTTTT 


7260 


GGATTAAATC 


AACTGCTCTC 


ATTCTTTCAC 


ACTTCTAAGG 


ATATAGTATC 


CCTTGTCTTT 


7320 


TTTAAGGATT 


TCACAATTGC 


CAAACACATC 


TTCCATCTTA 


GACTTGGCAC 


TTGGAGCTCC 


7380 


TTGTTTTTTC 


TGGATGACGA TGGTCAAATC 


TCCACCAATT 


TCCAAGAAAT 


CTTTACTTTT 


7440 


CTCGATGATT 


TCATGAACGA 


CTTGCTTGCC 


CGCACGGATA 


GGAGGATTGG 


AAATGACATG 


7500 


GTCAAATCGC 


CCTTGAACTC 


TTGCATAAAT 


ATTAGATTGA 


AATATCGTCG 


CTTTTGCATT 


7560 


ATTTTTTTCA 


GCATTTCTCT GAGCTAAATC 


CAGGGCACGA 


GTGTTAATAT 


CAACCATGGT 


7620 


CGCCTGAACT 


CCGTAAACCT 


TGACCAAGGA 


CAAACCTAAT 


GGACCATAAC 


CACAGCCTAC 


7680 


ATCTAGGACT 


GTCTCTCCTT 


GGTTGACATC 


CAGACACTTG 


AGCAAGAGTT 


GACTTCCAAA 


7740 


GTCAACCATT 


TTCTTGCTAA 


AAACACCCGC 


ATCTGTCAAA 


AAAGTCATTT 


TTTCTCCCAA 


7800 


CAAGTCCACT 


CTCAACTCAT 


GAATGTCGTG 


AGCAGCGTCA 


GGATTTTCTG 


CATAGTACAT 


7860 


TTTACTCATG 


ACACTATTTT 


ACCATAATTT 


GACTCAAATT 


GTAAATCGTT 


TACAAATTGA 


7920 


TAATAAAACG 


AAAAAGACCG 


AAGAAAGCAA 


GTCACGAAGC 


CATTTTCTTC 


AATCTCTTTC 


7980 


AACACTTATA 


AATAATAAAC 


CATTTAGAAC 


TATAAATATC 


ACAGTCCAGA 


TAAAAACAAA 


8040 


AAGTTTATCA 


TCTATAATCA 


GGCAGATTAT 


TATTTCTATT 


GCTTAACCTT 


AAAATACTTT 


8100 


ATTATCAACA 


AAATTCCTAA 


CAAAATGTTT 


AGATAAAAGC 


CCAACTGATA 


CGTTTATGTC 


8160 


AGGATTTCCA AACTTGTCCA AAGTCGTATC AAATCTTCTA GTGACATGTG 


GAAGAAATAA 


8220 


CCCTCTGTCG 


CAATCCGTAG 


GACTAAAAAG 


CAATAACTAC 


CCGCAGCAAT 


CCATTTCGTC 


8280 


CATCGTTTTT 


TAGTAAGAAA 


GCAATTAAGA 


ACGAACAAAT 


AAAGACAGCT 


GTTACAATAG 


8340 


CATGTTCCAT 


CAAAAAAGTA 


AAACCGTAAT 


AGGTTTCCAC 


AAAGCATCTA 


CCATTATCTG 


8400 


CATTGGTTCC 


TTTTATAAAA 


GGTAAAGCAA 


AACTTAAAAT 


AAAACAGAGT 


TCCAATATGT 


8460 


AACGTTTTAA 


GATTTTCATA 


GTACACCTCC 


TATAAGTTGT 


GAACTAAAAA 


GCCCCCTTTA 


8520 


TAAGCTTATA 


AATCAGTAGA 


ATCTATCTCC 


TATTTCATCA 


ATAAATTGAT 


CACTTATACT 


8580 


ATATACCATT 


GACTTACCAC 


ATTCAAGAAA 


CCGCTTTATT 


TTTTTAGCTT 


TTTATGGTAT 


8640 


GATAGACAAA ATATCTAGGG 


GAAAACAAAT 


GACCAACGAA 


TTTTTACATT 


TTGAAAAAAT 


8700 
CAGCCGCCAG ACTTGGCAAT CTTTACATCG AAAGACAACA CCTCCTTTGA CAGAAGAAGA 8760

ATTGGAATCT ATCAAGAGTT TTAATGACCA AATCAGTCTC CAAGACGTTA CAGATATCTA 8820 

TCTCCCCTTG GCTCATTTGA TTCAGATTTA CAAGCGAACT AAGGAAGATT TAGCCTTTTC 8880 

AAAAGGAATT TTCCTCCA 8898 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13188 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:



TATCTTAACG 


aGGATTGGGT 


TTATCGTCAG 


TCTTATTGCC 


CTAATTGTGG 


GAACAATCCC 


60 


TTAAATCATT 


TTGAAAATAA 


TCGGCCTGTA 


GCAGATTTTT 


ACTGTAATCA 


TTGTAGTGAG 


120 


GAGTTTGAAC 


TAAAGAGCAA 


AAAAGGAAAT 


TTTTCATCAA 


CAATCAATGA 


TGGTGCTTAT 


180 


GCAACGATGA 


TGAAGCGTGT 


GCAGGCAGAT 


AATAATCCTA 


ATTTCTTTTT 


TTTAACTTAC 


240 


ACAAAAAATT 


TTGAGGTAAA 


TAACTTTCTT 


GTCCTTCCGA 


AGCAATTTGT 


TACACCGAAA 


300 


TCGATTATTC 


AAAGAAAACC 


ACTTGCACCA 


ACTGCTAGAC 


GAGCAGGTTG 


GATTGGTTGT 


360 


AACATTGATT 


TATCACAAGT 


ACCTTCTAAA 


GGAAGGATAT 


TTCTTGTGCA 


AGATGGACAA 


420 


GTTAGAGATC 


CAGAAAAAGT 


TACAAAAGAA 


TTTAAGCAAG 


GTTTATTTTT 


AAGGAAGAGC 


480 


TCTCTGTCAT 


CAAGAGGTTG 


GACAATAGAA 


ATTCTAAATT 


GTATAGATAA 


GATAGAGGGT 


540 


TCAGAATTTA 


CCCTTGAAGA 


TATGTATCGT 


TTTGAAAGTG 


ACCTAAAAAA 


TATCTTTGTT 


600 


AAGAACAATC 


ATATCAAAGA 


AAAGATTAGG 


CAACAGCTTC 


AAATATTAAG 


AGACAAAGAA 


660 


ATAATAGAAT 


TTAAAGGTAG 


AGGAAAGTAT 


CGGAAATTAT 


GAAAACGAAA 


CAACTTGTTG 


720 


CATCAGAAGA 


GGTGTATGAT 


TTCTTAAAAG 


TCATCTGGCC 


TGATTATGAA 


ACTGAAAGCC 


780 


GTTACGATAA 


CCTAAGTTTA 


ATCGTCTGTA 


CCTTATCAGA 


TCCCGATTGT 


GTGAGATGGT 


840 


TATCTGAAAA 


TATGAAATTT 


GGTGACGAAA 


AACAACTAGC 


TTTGATGAAG 


GAAAAATATG 


900 


GGTGGGAAGT 


AGGAGATAAA 


TTGCCAGAGT 


GGCTACATAG 


CTCCTATCAT 


AGATTATTGT


960 


TAATAGGTGA 


ATTATTGGAA 


AGCAATCTAA 


AACTGAAAAA 


GTATACAGTA 


GAAATTACAG 


1020 


AAACTTTATC 


ACGTTTAGTA 


AGTATAGAGG 


CTGAAAATCC 


AGATGAAGCC 


GAACGACTTG 


1080 


TAAGAGAAAA 


GTATAAGAGT 


TGTGAAATTG 


TTCTTGATGC 


AGATGATTTT 


CAGGACTATG 


1140 
ACACTAGCAT ATATGAATAG GTAGATGTTT TTATTTTGTC AACAAAAAAG AGGCTCGCAC 1200

CTCTTTTTCT TATTTCTTTT TATGATTTAA TACGGCATTG AGGACAATAG CGAGTAGGCT 1260 

GGCTACGACG ATTCCGTTTG AGAAGAACAT TTGGAAGGCT GTCGGCATGC TGACAAAGAG 1320 

ATTACTGTTG TTGAGACCGA CACCTGCAGC GATTGAAACA GCTGCGATAA GGAAGTTGTG 1380 

TTCATTGTTA GCAAAGTCAA CACGGGCGAG GATTTGCATC CCTTGAATTG ATACAAAACC 1440 

AAACATTACC AGCATGGCAC CACCGAGGAC GGAGCTTGGA ATGATTTGGG CAAGGGCGCC 1500 

AAACTTAGGA AGCAGTCCAA GGAGAACCAG GAAACCAGCT GCGTAGTAGA TTGGCAGGCG 1560 

TTTTTTGATG CCTGACAATT TAACCAAACC AACGTTTTGT GAAAATCCGG TGTAAGGGAA 1620 

GGTGTTAAAG ATTCCTCCGA GAAGTACGGC CAAACCTTCT GCGCGGTATC CGTTGCGAAG 1680 

GCGCGTGCTG TCGATTGGAT CCTTTGTGAT ATCAGACAAG GCCAGATAAA CACCAGTTGA 1740 

CTCAACCATA GACACCGTTG CGATGATACA CATCATGACA ATAGATGAGA TTTCAAAGGT 1800 

TGGCATCCCA AAGTAGAGTG GAGTTGGGAC ATGGACAAGT GGAGCTACCG CAACAGGAGA 1860 

GAAGTCCACC AAGCCCATAG TAGCAGCAAT GGCAGTTCCA ACAACCAGAC CAATCAAAAT 1920 

AGAGATAGAC TTGATAAATC CTTTGGTAAA GATGTTGATC AAGAGGATAA TCAGAACAGT 1980

AATAGCTGCA AGCAAGAGAC TTTGACCAGT TGGCTCTGGA ACGTTATTTC CCATATTTCC 2040 

AATAGCGACA GGGATCAAGG TTAAACCAAT CGTGGTAATA ACAGATCCTG TTACGATAGA 2100 

TGGGAAGAGA TTGGCTACTT TTGAGAAGAT GCCTGAAACA AGAACCACGT AAATCCCAGA 2160 

TGCGATAAGG GCACCAAACA TAGCGCCACT ACCATGGCTT TGCCCAATCA TAATCAAGGG 2220 

AGCGACCGAC TGGAATGCAA CTCCAAGAAC GACTGGGAGT CCAATCCCAA AGTATTTGTT 2280 

GAGTTGGAGT TGGAGGAAGG TTGCCACCCC ACACATGAAG ATATCTGTAG AAATCAGGTA 2340 

GGTCAACTGC TCAGCTGAAT AGCCAAGGGC TGTCGCAATC ATGATGGGAA CCAGGATAGA 2400 

TCCTGAGTAC ATGGCTAGTA AGTGCTGCAA GCCAAGAACG GCTGCTTGCG AGTGTTTTTC 2460 

TTGAGTTTGC ATTAGAGATC TGCCTCCTTA AATACGACTT GACCATTTTC AAAACAATCC 2520 

AAACGAGCAA GTGATAGGAC AGGGTAGCCT GCTTTTTCAA GCAAATCACG ACCATCTTGG 2580 

AAGGATTTCT CAATCACGAT ACCGATAGCT TGGACTGTGG CACCGGCCTG TTCGATGATT 2640 

TGAATCAAGC CTTTAGCAGC TTGGCCATTA GCAAGGAAAT CGTCGATAAT CAAAACCTTG 2700 

TCCTCTGGTG AGAGGAATTT TTCAGCGATA GAAACGGTGC TGGTCACCTG CTTGGTAAAG 2760 

GAGTAGACTT GAGCAGTTAA GATGCCTTCG TTCATGGTGA TGTTCTTAGC TTTTTTGGCG 2820 

AAAATCATGG GAACGTTTAA GGCTTCAGCT GTAAAAACGG CTGGGGCAAT ACCCGACGCT 2880 

TCAATGGTTA CGACCTTGGT AATGCCAGTA GTAGCAAATT TTTCCGCAAA AACCTTACCA 2940 
ATCTCTCGCA


TCAAGCTAAA 


GTCAACTTGG 


TGGGTTAAAA 


AGGAATCTAC 


CTTGAGGATG 


3000 


TTATCACCCA 


AGATATGCCC 


ATCCTTGAGG 


ATGCGCTCTT 


CTAATAATTT 


CATAAGACCT 


3060


CCTAAAGTCT 


AAAAGTTAAT 


TTACTTGTTG 


TTTAAATATT 


TCTATAGTGA 


TCCCTTTTGC 


3120 


TAATACTATA TATTTGATAA AACTATTACG AGCGAAGCGA GTCTTATCAA ATATTTCCCG 


3180 


TTGTAGTGGT 


ATCATAGACA 


ATAATCTTGT 


TATTGTCTAT 


GACGGGATTT 


TTGAGAGTAA 


3240 


AATAGTTCGG 


GGAACTATTT 


TAGCCTAAGC 


CTAGAAATGA 


AAGAGCTAGG 


GGCTCAAAAA 


3300 


TTAGGGATGA AATTCCCTGG ATTCCTGAAA TTATTCACAG GATAATTTCA 


CCTCCCGTCC 


3360 


GCACTAATTA 


AGGGAAATAT 


TAAAAAAAGA 


CCTACTTAAT 


CTCTAAGTAA 


GTCCCCTAAA 


3420 


TAGACATGGC 


AAAAACGGCC 


ATATCTCACT 


GCTGACTTAC 


TTATTGTTAG 


GTGTTCCGGC 


3480 


ACCTTGTAGA 


AACGTCGTGC 


CAATTCACGA 


CATAAACAAG 


TAAAACGATA 


TTCAATTTTA 


3540 


AATAGGCTTG 


AGCCAATGTT 


TTTATTTTAC 


ACTAAATAAC 


TTTAGAAATC 


AACTATTTTG 


3600 


TTAGTGTTTT 


GGTTTAAAAA 


ACGAACAAAA 


AGAAGAGAGG 


GTGAACAAAA 


ACTCCATTGT 


3660 


AAGCTAACAG 


TTATACTAAA 


TGAAAATCAA 


AGAGCAAACT 


AGGAAGCTAT 


CCACAACCTC 


3720 


AAAACACTGT 


TTTGAGGTTG 


TGGATAGAAT 


TGACAGAGCC 


AGTATCATAT 


ACCTACGGTA 


3780 


AGGCGACGTT 


GACGTGGCTT 


GAAGAGATTT 


TCGAAGAGTA 


TTAGAAGATT 


TTTCCATCAT 


3840 


AAAAGGCATA 


CTATCAAGCT 


TTTAGACACC 


TGACAATATG 


CCTTTTTCTA 


ACTTTAAAGA 


3900 


CTTTTCCCAA 


TTTTTATTAT 


TCTACTCGCT 


AAATCTTAAA 


AAATAGCCAT 


CTGGATCCAA 


3960 


AACTGCAAAT 


TTATGAGGAT 


AGATATAGGG 


ATCACTGACA 


CGAAACTTTC 


TTTTGGTCAA 


4020 


GGGACGATAA ATAGGATAGT 


TTGCCTTCAT 


CACTCTTTAA 


TAGAGTTTTG 


AAACATCCTT 


4080


TATGCCAAAG 


GAGAGATTGA 


CTCCACGACC 


AAAGGGATAG 


GTCAGTTCAG 


CTAGTTGATC 


4140


CTTTGTTCCC 


TCCTCTAACA 


TTAGTTGACA 


CTCTTCAAGA 


GAAAGAGAAA 


GTTTTCTTCT 


4200 


GGACGTTGGT 


ATTCAATCCT 


AAAACCCAGT 


AAACCACAGT 


AGAAGGACCG 


GGACTGTTCG 


4260 


ATATTCGATA 


CAAGCAACTC 


GGGAATGACC 


GCATTGTAGT 


CCATATAGAA AATCCTTACA 


4320 


AGTCAATTTC 


CAAGACAATC 


GGTGTATGGT 


CTTGGCGAGC 


ACCTGAGTCA 


ATCATATCAG 


4380 


ATTTAGTGAC 


CTTGTCAGCG 


ATACGGTTAC 


TTGTGAGCCA 


GTAGTCGATT 


CTCCAGCCTG 


4440 


TATTGTTGAT 


TTTAGAAGTT 


TTGCTGCGTT 


GTGCCCACCA 


AGTGTAGCGT 


TCAGGAACAT


4500 


CGCCATGAAC 


ATGGCGGAAG 


GTGTCTGTAA 


ATCCAGTTGC 


CAAAAGGTTG 


GTAAATCCAG 


4560 


CACGTTCCTC 


GTCAGTAAAT 


CCAGGTGAAC 


GGCGGTTGCT 


AGCAGGATTT 


GGAAGGTCGA 


4620 


TTTCATTGTG 


GGCTACGTTG 


TAGTCACCGG 


TCGCAAGGAC 


TGGTTTTTCT 


TTGTCTAGTT 


4680 
CAGCCAAATA


CTCAGCATAT 


TTGGCATCCC 


AGACTTGGCG 


TTCTTCCAAG 


CGTTTGAGAC 


4740 


CGTCACCAGC 


GTTTGGAGTG 


TAAACTTGGG 


TTACGAAAAA 


TGCATCAAAT 


TCTAGAGTGA 


4800 


TGATACGACC 


TTCCAAGTCC 


ATGGTAGAAG 


GGGCACCGAT 


TTCTGGGAAG 


CTGATAGTAG 


4860 


GTGTAAGTTC 


TTTCTTATAA 


AGGAACATGG 


TTCCAGCATA 


GCCTTTACGG 


GCAGGCTCTT 


4920 


GGGAAGAGCG 


CCACGTGTTT 


TCGTAGCCTG 


GGAAGAGTTC 


TTCTAAAATT 


TCCACGTGTT 


4980 


TCTTTGTAGG 


TCCTTTGGCA 


GAAAGCTTGG 


TTTCTTGGAT 


AGCAATGATA 


TCAGCATTTT 


5040 


CAGCGACCAA 


GGTTTGTAGG 


ACTTCTTGGG 


ACAATTTGGC 


ACGAGCTGAG 


TCACTAGTTA 


5100 


GGGCAGCGTT 


TAGGGAATCA 


ATATTCCATG 


AGATAAGTTT 


CATAAAGTTA 


CCTTTTTCAT 


5160 


TCAGATTATA 


GATTTTATTA 


TACCAAAAAA 


AGATCTATTT 


CCCCAACGTA 


TGGTTTGAAA 


5220 


AATTACTCTC 


TTTCGTTTAT 


AATTAAGAAT 


GATTTTATGA 


AAGGGAGTGA 


AAATACATGA 


5280 


AATTCTACTC 


TTATGACTAT 


GTACTCAGCC 


AAATCGGTCA 


GCAAAATGGT 


ATCATGGTTG 


5340 


GCTTTGGGAT 


TGTTCTATTA 


GCTGTGACAG 


TTTTTTTTGC 


TTTCAAGGCA 


TACCATAATA 


5400 


AAAAGGGAAG 


CGAATTTCGT 


GAGTTGGTCA 


TGATTTCAGA 


TCTGGCCTTA 


TTTAGCTCTG 


5460 


CTTTTGGTCA 


GCATCACGAC 


TTATCAAAAC 


AATCAAGTTT 


CTAACAATAA 


ATTTCAAACT 


5520 


TCACTTCATT 


TCATCGAGGT 


TGTTTCCAAA 


GATTTGTGAG 


TAGACAAGTC 


AGAAGTCTAT 


5580 


GTTAATACTT 


CCACAAACAC 


AGATGGCGCA 


CTTATCAAGG 


TGGGAGATCG 


CTATTATCGT 


5640 


GCCCTAAATG 


GAAGTGAGCC 


AGACAAGTAC 


CTGTTAGAGA 


AAGTCGAATT 


GTATAAGACA 


5700 


GACGCAATTG 


AACTGGTGGA 


TGTGAACAAA 


TGACACTTAA 


TTATATCGAA 


ATTTTAATCA 


5760 


AACTGGTCTT 


GACTCTCAAA 


TAGCTCAACA 


ACAATGTTCA 


CTTTGTGAAA 


CGTTTGATTG 


5820 


ATGGTAAGCC 


AACTCTCCTT 


ATCAAAAATG 


GGAATATTGA 


CCCAGAAGCC 


TGTCGTTCAG 


5880 


TTGGTTTGTC 


TGCATCGGAT 


GTATCCCTCA 


AACTTCGTAG 


CCAAGGGATT 


TTCCAGATGA 


5940 


AGCAAGTCAA 


ACGAGCTGTG 


CAAGAGCAAA 


ATGGGCAACT 


CATCGTTGTG 


CAAATGGGAG 


6000 


ATGAAAATCC 


TAAGTATCCA 


GTTGTGACTG 


ACGGTGTGAT 


TCAAGTAGAT 


GTCTTGGAAT 


6060 


CGATTGGTCG 


TAGCGAAGAG 


TGGTTGCTTG 


ATAACCTCAG 


TAAACAAGGG 


CATGACAATG 


6120 


TAGCCAATAT 


CTTTATTGCT 


GAATATGACA 


AGGGTGCTGT 


TACAGTCGTA 


ACTTATGAAT 


6180 


AAGAAAAACC 


TGGGGTCTTG 


TACTCTTCGA 


AAATCTCTTC 


AAACCGCGTC 


AACGTCGCCT 


6240 


TGCCGTATGT 


AGGTTACTGA 


CTTCGTCAGT 


TCTATCTACA 


ACCTCAAAGC 


AGTGCTTTGA 


6300 


GCAGCCTGCG 


GCTAGTTTCC 


TAGTTTGCTC 


TTTGATTTTC 


ATTGAGTATT 


GGCCTCAGGT 


6360 


TTCCATTTGC 


AATCAGAAAG 


GGATTTTATG 


TCCATTATTC 


AAAAACTTTG 


GTGGTTTTTC 


6420 


AAGTTAGAAA 


AACGCCGTTA 


TCTAGTCGGA 


ATTGTGGCCC 


TGATCTTGGT 


TTCCGTCCTC 


6480 



